Dissertations / Theses on the topic 'Bioinformatics Visualization'

Consult the top 45 dissertations / theses for your research on the topic 'Bioinformatics Visualization.'

1

Rohrschneider, Markus. "Visualization of Metabolic Networks." Doctoral thesis, Universitätsbibliothek Leipzig, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-160528.

Abstract:
The metabolism constitutes the universe of biochemical reactions taking place in a cell of an organism. These processes include the synthesis, transformation, and degradation of molecules for an organism to grow, to reproduce and to interact with its environment. A good way to capture the complexity of these processes is the representation as a metabolic network, in which sets of molecules are transformed into products by a chemical reaction, and the products are processed further. The underlying graph model allows a structural analysis of this network using established graph-theoretical algorithms on the one hand, and a visual representation by applying layout algorithms combined with information visualization techniques on the other. In this thesis we will take a look at three different aspects of graph visualization within the context of biochemical systems: the representation and interactive exploration of static networks, the visual analysis of dynamic networks, and the comparison of two network graphs. We will demonstrate how established infovis techniques can be combined with new algorithms and applied to specific problems in the area of metabolic network visualization. We reconstruct the metabolic network covering the complete set of chemical reactions present in a generalized eukaryotic cell from real-world data available from a popular metabolic pathway database and present a suitable data structure. As the constructed network is very large, it is not feasible to display it as a whole. Instead, we introduce a technique to analyse this static network in a top-down approach starting with an overview and displaying detailed reaction networks on demand. This exploration method is also applied to compare metabolic networks in different species and from different resources. As for the analysis of dynamic networks, we present a framework to capture changes in the connectivity as well as changes in the attributes associated with the network’s elements.
2

McDowell, Graeme S. V. "Advancing Lipidomic Bioinformatics: Visualization and phosphoLipid IDentification (VaLID)." Thesis, Université d'Ottawa / University of Ottawa, 2015. http://hdl.handle.net/10393/32203.

Abstract:
Lipidomics is a relatively new field under the heading of systems biology. Due to its infancy, the field suffers from significant ‘growing pains’, one of which is the lack of the bioinformatic analytic resources that other “-omics” fields enjoy. Here, I describe the creation and validation of the glycerophospholipid identification program VaLID. Using an in silico approach, we generated a comprehensive database containing all of the glycerophospholipids within multiple sub-classes: those containing chains of 0 to 30 carbons with up to 6 unsaturations and various linkages. Using Java, I created a web-based computer interface with a search engine and a visualization tool to access this database. In comparing results to current programs, I found that VaLID consistently contained more identity predictions than did the current gold standard LipidMAPS. Results from several tests with real datasets confirm that VaLID is more than capable as a phospholipid identification tool for use in lipidomics.
3

Rönnbrant, Anders. "Implementing a visualization tool for myocardial strain tensors." Thesis, Linköping University, Department of Biomedical Engineering, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-5173.

Abstract:
The heart is a complex three-dimensional structure with mechanical properties that are inhomogeneous, non-linear, time-variant and anisotropic. These properties affect major physiological factors within the heart, such as the pumping performance of the ventricles, the oxygen demand in the tissue and the distribution of coronary blood flow.
During the cardiac cycle the heart muscle tissue is deformed as a consequence of the active contraction of the muscle fibers and their subsequent relaxation. A mapping of this deformation would give increased understanding of the mechanical properties of the heart. The deformation induces strain and stress in the tissue, both of which are mechanical properties that can be described with a mathematical tensor object.
The aim of this master's thesis is to develop a visualization tool for the strain tensor objects that can help a user see and understand differences between different hearts as well as spatial and temporal differences within the same heart. Preferably, the tool should be general enough for use with different types of data.
4

Ding, Hao. "Visualization and Integrative analysis of cancer multi-omics data." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1467843712.

5

Liu, Feng. "Platform Independent Real-Time X3D Shaders and their Applications in Bioinformatics Visualization." Digital Archive @ GSU, 2007. http://digitalarchive.gsu.edu/cs_diss/24.

Abstract:
Since the introduction of programmable Graphics Processing Units (GPUs) and procedural shaders, hardware vendors have each developed their own individual real-time shading language standard. None of these shading languages is fully platform independent. Although this real-time programmable shader technology can be built into 3D applications on a single system, its platform dependence keeps the shader technology away from 3D Internet applications. The primary purpose of this dissertation is to design a framework for translating different shader formats to platform independent shaders and embedding them into the eXtensible 3D (X3D) scene for 3D web applications. This framework includes a back-end core shader converter, which translates shaders among different shading languages with a middle XML layer. Also included is a shader library containing a basic set of shaders that developers can load and add shaders to. This framework will then be applied to some applications in Biomolecular Visualization.
6

Wang, Wei. "Visualization of Clinically Annotated Electrophysiological Data for Multi-Center Sleep Studies." Case Western Reserve University School of Graduate Studies / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=case1436288274.

7

Shi, Jieming. "Novel bioinformatics tools for miRNA-Seq analysis, RNA structure visualization, and genome-wide repeat detection." Miami University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=miami15003113547315.

8

Indukuri, Kiran Kumar. "Fusion: a Visualization Framework for Interactive ILP Rule Mining With Applications to Bioinformatics." Thesis, Virginia Tech, 2004. http://hdl.handle.net/10919/36326.

Abstract:
Microarrays provide biologists an opportunity to find the expression profiles of thousands of genes simultaneously. Biologists try to understand the mechanisms underlying the life processes by finding out relationships between gene-expression and their functional categories. Fusion is a software system that aids the biologists in performing microarray data analysis by providing them with both visual data exploration and data mining capabilities. Its multiple view visual framework allows the user to choose different views for different types of data. Fusion uses Proteus, an Inductive Logic Programming (ILP) rule finding algorithm to mine relationships in the microarray data. Fusion allows the user to explore the data interactively, choose biases, run the data mining algorithms and visualize the discovered rules. Fusion has the capability to smoothly switch across interactive data exploration and batch data mining modes. This optimizes the knowledge discovery process by facilitating a synergy between the interactivity and usability of visualization process with the pattern-finding abilities of ILP rule mining algorithms. Fusion was successful in helping biologists better understand the mechanisms underlying the acclimatization of certain varieties of Arabidopsis to ozone exposure.
Master of Science
9

Wang, Jeremy R. "Analysis and Visualization of Local Phylogenetic Structure within Species." Thesis, The University of North Carolina at Chapel Hill, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=3562960.

Abstract:
While it is interesting to examine the evolutionary history and phylogenetic relationship between species, for example, in a sort of "tree of life", there is also a great deal to be learned from examining population structure and relationships within species. A careful description of phylogenetic relationships within species provides insights into causes of phenotypic variation, including disease susceptibility. The better we are able to understand the patterns of genotypic variation within species, the better these populations may be used as models to identify causative variants and possible therapies, for example through targeted genome-wide association studies (GWAS). My thesis describes a model of local phylogenetic structure, how it can be effectively derived under various circumstances, and useful applications and visualizations of this model to aid genetic studies.
I introduce a method for discovering phylogenetic structure among individuals of a population by partitioning the genome into a minimal set of intervals within which there is no evidence of recombination. I describe two extensions of this basic method. The first allows it to be applied to heterozygous, in addition to homozygous, genotypes and the second makes it more robust to errors in the source genotypes.
I demonstrate the predictive power of my local phylogeny model using a novel method for genome-wide genotype imputation. This imputation method achieves very high accuracy, on the order of the accuracy rate in the sequencing technology, by imputing genotypes in regions of shared inheritance based on my local phylogenies.
Comparative genomic analysis within species can be greatly aided by appropriate visualization and analysis tools. I developed a framework for web-based visualization and analysis of multiple individuals within a species, with my model of local phylogeny providing the underlying structure. I will describe the utility of these tools and the applications for which they have found widespread use.
10

Baker, Frazier N. "Mining and Visualization of Amino Acid Coevolution Data." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1571061614939124.

11

Lunev, Alexey. "Evaluation and visualization of complexity in parameter setting in automotive industry." Thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-362711.

Abstract:
Parameter setting is a process primarily used to specify in what kind of vehicle an electronic control unit of each type is used. This thesis investigates whether the current strategy for measuring complexity gives users satisfactory results. The strategy consists of structure-based algorithms that are an essential part of the Complexity Analyzer, a prototype application used to evaluate the complexity. The results described in this work suggest that the currently implemented algorithms have to be properly defined and adapted before they can be used for parameter setting. Moreover, the measurements that the algorithms output have been analyzed in more detail, making the results easier to interpret. It has been shown that a typical parameter setting file can be regarded as a tree structure. To measure variation in this structure, a new concept called Path entropy has been formulated, tested and implemented. The main disadvantage of the original version of the Complexity Analyzer application is its lack of user-friendliness. Therefore, a web version of the application based on the Model-View-Controller technique has been developed. Unlike the original version, it includes a user interface, and it takes just a couple of seconds to see the visualization of the data, whereas the original version took several minutes to run.
12

Sheth, Vrunda. "Visualization of protein 3D structures in reduced representation with simultaneous display of intra and inter-molecular interactions." Online version of thesis, 2009. http://hdl.handle.net/1850/10857.

13

Qi, Xinjian. "Computational Analysis, Visualization and Text Mining of Metabolic Networks." Case Western Reserve University School of Graduate Studies / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=case1378479338.

14

Rose, Jarod. "An Investigation and Visualization of MicroRNA Targets and Gene Expressions and Their Use in Classifying Cancer Samples." University of Akron / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=akron1302303717.

15

King, James Lowell. "Gene Ontology-Guided Force-Directed Visualization of Protein Interaction Networks." Diss., NSUWorks, 2019. https://nsuworks.nova.edu/gscis_etd/1066.

Abstract:
Protein interaction data is being generated at unprecedented rates thanks to advancements made in high throughput techniques such as mass spectrometry and DNA microarrays. Biomedical researchers, operating under budgetary constraints, have found it difficult to scale their efforts to keep up with the ever-increasing amount of available data. They often lack the resources and manpower required to analyze the data using existing methodologies. These research deficiencies impede our ability to understand diseases, delay the advancement of clinical therapeutics, and ultimately cost lives. One of the most commonly used techniques to analyze protein interaction data is the construction and visualization of protein interaction networks. This research investigated the effectiveness and efficiency of novel domain-specific algorithms for visualizing protein interaction networks. The existing domain-agnostic algorithms were compared to the novel algorithms using several performance, aesthetic, and biological relevance metrics. The graph drawing algorithms proposed here introduced novel domain-specific forces to the existing force-directed graph drawing algorithms. The innovations include an attractive force and graph coarsening policy based on semantic similarity, and a novel graph refinement algorithm. These experiments have demonstrated that the novel graph drawing algorithms consistently produce more biologically meaningful layouts than the existing methods. Aggregated over the 480 tests performed, and quantified using the Biological Evaluation Percentage metric defined in the Methodology chapter, the novel graph drawing algorithms created layouts that are 237 percent more biologically meaningful than the next best algorithm. This improvement came at the cost of additional edge crossings and smaller minimum angles between adjacent edges, both of which are undesirable aesthetics.
The aesthetic and performance tradeoffs are experimentally quantified in this study, and dozens of algorithmically generated graph drawings are presented to visually illustrate the benefits of the novel algorithms. The graph drawing algorithms proposed in this study will help biomedical researchers to more efficiently produce high quality interactive protein interaction network drawings for improved discovery and communication.
16

Narayanan, Kanchana. "MAVEN: a tool for Visualization and Functional Analysis of Genome-Wide Association Studies." Cleveland, Ohio : Case Western Reserve University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=case1269455528.

Abstract:
Thesis (Master of Sciences), Case Western Reserve University, 2010. Department of EECS - Computer and Information Sciences. Title from PDF (viewed on 2010-05-25). Includes abstract. Includes bibliographical references and appendices. Available online via the OhioLINK ETD Center.
17

Yoo, Hyun Seung. "Color Illusions on Liquid Crystal Displays and Design Guidelines for Information Visualization." Thesis, Virginia Tech, 2007. http://hdl.handle.net/10919/36372.

Abstract:
The influence of color on size and depth perception has been explored for a century, but there is very limited research on interventions that can reduce the color illusions. This study was motivated to identify interventions and propose design guidelines for information visualization, especially where size judgment is critical.
This study replicated the color size illusion and color depth illusion on an LCD monitor and found that yellow is perceived as the smallest and farthest color among red, yellow, green, and blue on a white background. Three types of interventions (background brightness, border color, and background grid brightness) were tested to identify the conditions that reduce the color illusions, but none of them had a statistically significant effect.
Based on the experiment results and literature survey, design guidelines were proposed. To extend the guidelines to the bioinformatics field, design recommendations were proposed and implementation examples were illustrated. The design implementations were evaluated by interviewing domain experts.
Additionally, the relationship between the color size illusion and the color depth illusion was explored.
Master of Science
18

Sutharzan, Sreeskandarajan. "Clustering and Visualization of Genomic Data." Miami University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=miami1563973517163859.

19

Janowski, Sebastian Jan. "VANESA - A bioinformatics software application for the modeling, visualization, analysis, and simulation of biological networks in systems biology applications." Bielefeld: Universitaetsbibliothek Bielefeld, 2013. http://d-nb.info/1036112020/34.

20

Stokes, Todd Hamilton. "Development of a visualization and information management platform in translational biomedical informatics." Diss., Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/33967.

Abstract:
Translational Biomedical Informatics (TBMI) is an emerging discipline expanding beyond traditional bioinformatics, with a focus on developing computational technologies for real-world biomedical practice. The goal of my Ph.D. research is to address a few key challenges in TBMI, including: (1) the high quality and reproducibility required by medical applications when processing high throughput data, (2) the need for knowledge management solutions that allow molecular data to be handled and evaluated by researchers, regulators, and doctors collectively, (3) the need for near real-time, efficient access to decision-oriented visualizations of integrated data and data processing results, and (4) the need for an integrated solution that can evolve as medical consensus evolves, without requiring retraining, overhaul or replacement. This dissertation resulted in the development and adoption of concrete web-based application deliverables in regular use by bioinformaticians, clinicians, biologists and nanotechnologists. These include: the Chip Artifact Correction (caCORRECT) web site and grid services, the ArrayWiki community microarray repository, and the SimpleVisGrid visualization grid services (including eGOMiner, nanoDRIVE, PathwayVis and SphingoVisGrid).
21

Dabdoub, Shareef Majed. "Applied Visual Analytics in Molecular, Cellular, and Microbiology." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1322602183.

22

Guo, Chen. "Improve the Diagnosis on Fundus Photography with Deep Transfer Learning." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1621674781655785.

23

Ràmia, Jesús Miquel. "Visualization, description and analysis of the genome variation of a natural population of Drosophila melanogaster." Doctoral thesis, Universitat Autònoma de Barcelona, 2015. http://hdl.handle.net/10803/322822.

Abstract:
The description and explanation of genetic variation within and between populations, the goal of population genetics since its origins, has been hampered for decades by the technical inability to directly measure the genetic variation of populations. The present genome era, with the explosive growth of genome sequences fueled by next-generation sequencing technologies, has led us to the present golden age of the study of genetic variation at the genome scale. Population genetics is no longer an empirically insufficient science; more than ever, it is a research field where bioinformatics tools for data mining and management of large-scale datasets, statistical and evolutionary models, and advanced molecular techniques for the mass generation of sequences are all integrated in an interdisciplinary endeavor. As a consequence of this breakthrough, a new ‘omic’ discipline has emerged: Population Genomics. But what is Population Genomics? For Charlesworth (2010), it is simply "a new term for a field of study as old as Genetics itself". It is the 'old field' of Population Genetics when the amount and causes of variability in natural populations are studied in a genome-wide fashion. This thesis is both a population genomics study and a bioinformatics project centred on the visualization, description and analysis of genome-wide DNA variation data from a natural population of the model organism Drosophila melanogaster. The data used were obtained by the international initiative The Drosophila Genetic Reference Panel (DGRP) (Mackay et al. 2012). The DGRP has sequenced the complete genomes of 158 (freeze 1) and 205 (freeze 2) inbred lines of Drosophila melanogaster from a single natural population in Raleigh (USA).
A major goal of this project was to create a resource of common genetic polymorphism data for further population genomics analyses. The DGRP sequence data have allowed us to carry out a thorough study of genome-wide variation in a natural population of D. melanogaster. After developing a complete, public and accessible map of the polymorphism present in this population, we have described the patterns of polymorphism and divergence (nucleotide and indel variants) along chromosome arms. We observe a clear and consistent pattern of genome diversity along the arms of the autosomes for both SNPs (single-nucleotide polymorphisms) and indels: diversity is reduced on average in centromeric regions relative to non-centromeric regions, and at the telomeres. This pattern is not observed in the X chromosome, where diversity is almost uniform all along the chromosome. Polymorphism and recombination are correlated along chromosome arms, but only for those regions where the recombination rate is below 2 cM/Mb. Recombination rate seems to be the major force shaping the patterns of polymorphism along chromosome arms, and its effect seems to be mediated by its impact on linked selection. We have mapped the footprint of natural selection on SNP and indel variants throughout the genome, observing a pervasive action of natural selection, both adaptive and purifying. Adaptive selection occurs preferentially in non-centromeric regions. Natural selection acts differently on insertions and deletions, with deletions being more strongly selected against by purifying selection, which supports the mutational equilibrium theory for genome size evolution.
24

Li, Qihang. "Visual Analytics of Patterns of Gene Expression in the Developing Mammalian Brains." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492643127036346.

25

Martinez, Xavier. "Tracking sans marqueur de modèles physiques modulaires et articulés : vers une interface tangible pour la manipulation de simulations moléculaires." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS231/document.

Abstract:
Les modèles physiques moléculaires sont depuis longtemps utilisés dans le domaine de la biologie structurale et de la chimie. Malgré l’apparition de représentations numériques qui offrent une grande variété de visualisations moléculaires dynamiques et permettent notamment d’analyser visuellement les résultats de simulations, les modèles physiques moléculaires sont encore fréquemment utilisés. En effet, la manipulation directe et la construction manuelle de modèles physiques moléculaires facilitent l’élaboration et la mémorisation d’une représentation mentale des structures moléculaires 3D. Les techniques d’interaction avec des objets 3D n’atteignent pas encore la finesse et la richesse de perception et de manipulation des modèles physiques. Par ailleurs, l’interaction avec des représentations moléculaires virtuelles est rendue particulièrement difficile car les structures moléculaires sont très complexes du fait de leur taille, de leur caractère tridimensionnel et de leur flexibilité, auquel s'ajoutent la quantité et la variété des informations qui les caractérisent. Pour aborder la problématique de l'interaction avec ces structures moléculaires, nous proposons dans cette thèse de concevoir une interface tangible moléculaire combinant les avantages des représentations physiques et virtuelles. Pour réaliser une interface tangible flexible et modulaire, à l’image des biomolécules à manipuler, ce travail de thèse a dû relever plusieurs défis scientifiques avec pour contrainte majeure le fait de proposer une approche se passant de marqueurs et dispositif de capture 3D complexe. La première étape fut de choisir, concevoir et fabriquer un modèle physique permettant la manipulation de molécules avec de nombreux degrés de libertés. La seconde étape consistait à créer un modèle numérique permettant de reproduire le comportement mécanique du modèle physique. 
Enfin, il a fallu concevoir des méthodes de recalage utilisant des techniques de traitement d'image en temps réel pour que le modèle physique puisse contrôler, par couplage, son avatar virtuel. En terme de traitement d’image, de nouvelles méthodes ont été conçues implémentées et évaluées afin d'une part, d’identifier et de suivre les atomes dans l’espace image et d'autre part, d'alimenter la méthode de reconstruction 3D avec un faible nombre de points. L'une de nos contributions a été d'adapter la méthode de Structure from Motion en incluant des connaissances biochimiques pour guider la reconstruction. Par ailleurs, la visualisation conjointe de modèles physiques de molécules et de leur avatar virtuel dynamique, parfois co-localisé dans un contexte de réalité augmentée, a été abordée. Pour cela, des méthodes de visualisation haute performance adaptées à ce contexte ont été conçues afin d’améliorer la perception des formes et cavités, caractéristiques importantes des molécules biologiques. Par exemple, l’occultation ambiante ou le raycasting de sphères avec des ombres portées dynamiques permettent d’augmenter un modèle physique en tenant compte de l’illumination réelle pour une meilleure intégration en réalité augmentée. Les retombées de ce travail en terme d’usage sont nombreuses dans le domaine de la recherche et de la pédagogie en biologie moléculaire, comme dans le domaine de la conception de médicaments et plus particulièrement du Rational Drug Design. L'expert doit être au centre de la tâche de conception de médicament pour la rendre plus efficace et rationnelle, à l’image du succès du jeu sérieux Fold’It, auquel s’ajoute le bénéfice de l’utilisation d’interface tangible capable de manipuler les nombreux degrés de liberté intrinsèques des biomolécules<br>Physical molecular models have long been used in the structural biology and chemistry fields. 
Despite the emergence of digital representations that offer a wide variety of dynamic molecular visualizations, notably for the visual analysis of simulation results, physical molecular models are still frequently used. Direct manipulation and manual assembly of physical models make it easier to build and memorize a mental representation of 3D molecular structures. Interaction techniques for virtual 3D objects do not yet match the fineness and richness of perception and manipulation that physical models offer. Moreover, interacting with virtual molecular representations remains difficult because of the complexity of molecular structures: their size, their three-dimensional nature, their flexibility, and the quantity and variety of data that characterize them. In this thesis, we address this issue by designing a molecular tangible interface that combines the advantages of physical and virtual representations. To build a tangible interface as flexible and modular as the biomolecules to be manipulated, this work had to meet several scientific challenges under one major constraint: the approach must not rely on markers or on a complex 3D capture device. The first step was to choose, design and build a physical model that allows molecules to be manipulated with many degrees of freedom. The second step was to create a numerical model that reproduces the mechanical behavior of the physical model. Lastly, we developed registration methods based on real-time image processing so that the physical model can control its virtual avatar through coupling. New image processing methods were designed, implemented and evaluated to identify and track atoms in image space. A Structure from Motion method was adapted to reconstruct the 3D atom positions from a small number of points, using biochemical knowledge to guide the reconstruction. Finally, we address the joint visualization of physical models and their dynamic virtual representations, sometimes co-located in an Augmented Reality context.
High-performance visualization methods adapted to this context were developed to enhance the perception of shapes and cavities, two important characteristics of biological molecules. For instance, ambient occlusion or sphere raycasting with dynamic shadows can augment a physical model while taking the real illumination of the scene into account, for better integration in an Augmented Reality context. This work has applications in both molecular biology education and research: rational drug design could benefit from the user's expertise to optimize the design of drugs by manipulating the numerous degrees of freedom of biomolecules through a tangible interface. Just as Fold'It contributes to solving the folding problem, a similar approach could be applied to the molecular docking problem using advanced manipulation interfaces.
26

Ayllón-Benítez, Aarón. "Development of new computational methods for a synthetic gene set annotation." Thesis, Bordeaux, 2019. http://www.theses.fr/2019BORD0305.

Abstract:
Les avancées dans l'analyse de l'expression différentielle de gènes ont suscité un vif intérêt pour l'étude d'ensembles de gènes présentant une similarité d'expression au cours d'une même condition expérimentale. Les approches classiques pour interpréter l'information biologique reposent sur l'utilisation de méthodes statistiques. Cependant, ces méthodes se focalisent sur les gènes les plus connus tout en générant des informations redondantes qui peuvent être éliminées en prenant en compte la structure des ressources de connaissances qui fournissent l'annotation. Au cours de cette thèse, nous avons exploré différentes méthodes permettant l'annotation d'ensembles de gènes. Premièrement, nous présentons les solutions visuelles développées pour faciliter l'interprétation des résultats d'annotation d'un ou plusieurs ensembles de gènes. Dans ce travail, nous avons développé un prototype de visualisation, appelé MOTVIS, qui explore l'annotation d'une collection d'ensembles des gènes. MOTVIS utilise ainsi une combinaison de deux vues inter-connectées : une arborescence qui fournit un aperçu global des données mais aussi des informations détaillées sur les ensembles de gènes, et une visualisation qui permet de se concentrer sur les termes d'annotation d'intérêt. La combinaison de ces deux visualisations a l'avantage de faciliter la compréhension des résultats biologiques lorsque des données complexes sont représentées. Deuxièmement, nous abordons les limitations des approches d'enrichissement statistique en proposant une méthode originale qui analyse l'impact d'utiliser différentes mesures de similarité sémantique pour annoter les ensembles de gènes.
Pour évaluer l'impact de chaque mesure, nous avons considéré deux critères comme étant pertinents pour évaluer une annotation synthétique de qualité d'un ensemble de gènes : (i) le nombre de termes d'annotation doit être réduit considérablement tout en gardant un niveau suffisant de détail, et (ii) le nombre de gènes décrits par les termes sélectionnés doit être maximisé. Ainsi, neuf mesures de similarité sémantique ont été analysées pour trouver le meilleur compromis possible entre réduire le nombre de termes et maintenir un niveau suffisant de détails fournis par les termes choisis. Tout en utilisant la Gene Ontology (GO) pour annoter les ensembles de gènes, nous avons obtenu de meilleurs résultats pour les mesures de similarité sémantique basées sur les nœuds qui utilisent les attributs des termes, par rapport aux mesures basées sur les arêtes qui utilisent les relations qui connectent les termes. Enfin, nous avons développé GSAn, un serveur web basé sur les développements précédents et dédié à l'annotation d'un ensemble de gènes a priori. GSAn intègre MOTVIS comme outil de visualisation pour présenter conjointement les termes représentatifs et les gènes de l'ensemble étudié. Nous avons comparé GSAn avec des outils d'enrichissement et avons montré que les résultats de GSAn constituent un bon compromis pour maximiser la couverture de gènes tout en minimisant le nombre de termes. Le dernier point exploré est une étape visant à étudier la faisabilité d'intégrer d'autres ressources dans GSAn. Nous avons ainsi intégré deux ressources, l'une décrivant les maladies humaines avec Disease Ontology (DO) et l'autre les voies métaboliques avec Reactome. Le but était de fournir de l'information supplémentaire aux utilisateurs finaux de GSAn. Nous avons évalué l'impact de l'ajout de ces ressources dans GSAn lors de l'analyse d’ensembles de gènes.
L'intégration a amélioré les résultats en couvrant davantage de gènes sans pour autant affecter de manière significative le nombre de termes impliqués. Ensuite, les termes GO ont été mis en correspondance avec les termes DO et Reactome, a priori et a posteriori des calculs effectués par GSAn. Nous avons montré qu'un processus de mise en correspondance appliqué a priori permettait d'obtenir un plus grand nombre d'inter-relations entre les deux ressources.

The revolution in sequencing technologies, by greatly increasing the production of omics data, is leading to new understandings of the relations between genotype and phenotype. To interpret and analyze data grouped according to a phenotype of interest, methods based on statistical enrichment have become a standard in biology. However, these methods synthesize the biological information by selecting the over-represented terms a priori, and they focus on the most studied genes, which may cover only a limited share of the annotated genes within a gene set. During this thesis, we explored different methods for annotating gene sets. In this framework, we carried out three studies on the annotation of gene sets, with the aim of improving the understanding of their biological context. First, visualization approaches were applied to represent annotation results provided by enrichment analysis for a gene set or a repertoire of gene sets. In this work, a visualization prototype called MOTVIS (MOdular Term VISualization) was developed to provide an interactive representation of a repertoire of gene sets combining two visual metaphors: a treemap view that provides an overview and also displays detailed information about gene sets, and an indented tree view that can be used to focus on the annotation terms of interest. MOTVIS has the advantage of overcoming the limitations of each visual metaphor when used individually.
This illustrates the value of using different visual metaphors to facilitate the comprehension of biological results when complex data are represented. Secondly, to address the limitations of enrichment analysis, a new method for analyzing the impact of using different semantic similarity measures on gene set annotation was proposed. To evaluate the impact of each measure, two relevant criteria were considered for characterizing a "good" synthetic gene set annotation: (i) the number of annotation terms has to be drastically reduced while maintaining a sufficient level of detail, and (ii) the number of genes described by the selected terms should be as large as possible. Thus, nine semantic similarity measures were analyzed to identify the best possible compromise between both criteria. Using GO to annotate the gene sets, we observed better results with node-based measures, which use the terms' characteristics, than with edge-based measures, which use the relations between terms. The annotation of the gene sets achieved with the node-based measures did not exhibit major differences regardless of the characteristics of the terms used. Then, we developed GSAn (Gene Set Annotation), a novel gene set annotation web server that uses semantic similarity measures to synthesize a priori GO annotation terms. GSAn integrates the interactive visualization MOTVIS, dedicated to visualizing the representative terms of gene set annotations. Compared to enrichment analysis tools, GSAn showed excellent results in terms of maximizing the gene coverage while minimizing the number of terms. Finally, the third work consisted in enriching the annotation results provided by GSAn. Since the knowledge described in GO may not be sufficient for interpreting gene sets, other biological information, such as pathways and diseases, may be useful to provide a wider biological context.
Thus, two additional knowledge resources, Reactome and Disease Ontology (DO), were integrated within GSAn. In practice, GO terms were mapped to terms of Reactome and DO, before and after applying the GSAn method. The integration of these resources improved the results in terms of gene coverage without significantly affecting the number of terms involved. Two strategies were applied to find mappings (generated or extracted from the web) between each new resource and GO. We have shown that applying the mapping process before computing the GSAn method yielded a larger number of inter-relations between the two knowledge resources.
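
The two criteria named in this abstract (few annotation terms, many genes described) can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function name `gene_coverage`, the term labels and the gene identifiers are hypothetical, and the code does not reproduce the GSAn method or any of its semantic similarity measures.

```python
def gene_coverage(gene_set, term_to_genes, selected_terms):
    """Fraction of genes in gene_set described by at least one selected term."""
    covered = set()
    for term in selected_terms:
        # Only count genes that belong to the studied gene set.
        covered |= term_to_genes.get(term, set()) & gene_set
    return len(covered) / len(gene_set)


# Hypothetical toy data: four genes and three annotation terms.
genes = {"G1", "G2", "G3", "G4"}
annotations = {
    "term_A": {"G1", "G2"},
    "term_B": {"G2", "G3"},
    "term_C": {"G4", "G9"},  # G9 is outside the gene set and is ignored
}

coverage_two_terms = gene_coverage(genes, annotations, ["term_A", "term_B"])
coverage_all_terms = gene_coverage(genes, annotations, ["term_A", "term_B", "term_C"])
```

Comparing the coverage obtained with two terms against the coverage obtained with all three shows the trade-off the abstract describes: each added term can raise coverage, but it also lengthens the annotation.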
27

Han, Hongqing. "Towards accurate and efficient live cell imaging data analysis." Doctoral thesis, Humboldt-Universität zu Berlin, 2021. http://dx.doi.org/10.18452/22324.

Abstract:
Um dynamische zelluläre Prozesse wie Zellzyklus, Signaltransduktion oder Transkription zu analysieren, wird Live-cell-imaging mittels Zeitraffermikroskopie verwendet. Um nun aber Zellabstammungsbäume aus einem Zeitraffervideo zu extrahieren, müssen die Zellen segmentiert und verfolgt werden können. Besonders hier, wo lebende Zellen über einen langen Zeitraum betrachtet werden, sind Fehler in der Analyse fatal: Selbst eine extrem niedrige Fehlerrate kann sich amplifizieren, wenn viele Zeitpunkte aufgenommen werden, und damit den gesamten Datensatz unbrauchbar machen. In dieser Arbeit verwenden wir einen einfachen, aber praktischen Ansatz, der die Vorzüge der manuellen und automatischen Ansätze kombiniert. Das von uns entwickelte Live-cell-Imaging Datenanalysetool ‘eDetect’ ergänzt die automatische Zellsegmentierung und -verfolgung durch Nachbearbeitung. Das Besondere an dieser Arbeit ist, dass sie mehrere interaktive Datenvisualisierungsmodule verwendet, um den Benutzer zu führen und zu unterstützen. Dies erlaubt, den gesamten manuellen Eingriffsprozess rational und effizient zu gestalten. Insbesondere werden zwei Streudiagramme und eine Heatmap verwendet, um die Merkmale einzelner Zellen interaktiv zu visualisieren. Die Streudiagramme positionieren ähnliche Objekte in unmittelbarer Nähe. So kann eine große Gruppe ähnlicher Fehler mit wenigen Mausklicks erkannt und korrigiert werden, und damit die manuellen Eingriffe auf ein Minimum reduziert werden. Die Heatmap ist darauf ausgerichtet, alle übersehenen Fehler aufzudecken und den Benutzern dabei zu helfen, bei der Zellabstammungsrekonstruktion schrittweise die perfekte Genauigkeit zu erreichen. Die quantitative Auswertung zeigt, dass eDetect die Genauigkeit der Nachverfolgung innerhalb eines akzeptablen Zeitfensters erheblich verbessern kann.
Beurteilt nach biologisch relevanten Metriken, übertrifft die Leistung von eDetect die der Tools, die den Wettbewerb ‘Cell Tracking Challenge’ gewonnen haben.

Live cell imaging based on time-lapse microscopy has been used to study dynamic cellular behaviors, such as cell cycle, cell signaling and transcription. Extracting cell lineage trees out of a time-lapse video requires cell segmentation and cell tracking. For long-term live cell imaging, data analysis errors are particularly fatal. Even an extremely low error rate could potentially be amplified by the large number of sampled time points and render the entire video useless. In this work, we adopt a straightforward but practical design that combines the merits of manual and automatic approaches. We present a live cell imaging data analysis tool, 'eDetect', which uses post-editing to complement automatic segmentation and tracking. What makes this work special is that eDetect employs multiple interactive data visualization modules to guide and assist users, making the error detection and correction procedure rational and efficient. Specifically, two scatter plots and a heat map are used to interactively visualize the features of single cells. The scatter plots position similar results in close vicinity, making it easy to spot and correct a large group of similar errors with a few mouse clicks and minimizing repetitive human interventions. The heat map is aimed at exposing all overlooked errors and helping users progressively approach perfect accuracy in cell lineage reconstruction. Quantitative evaluation shows that eDetect is able to substantially improve accuracy within an acceptable time frame, and that its performance surpasses the winners of most tasks in the 'Cell Tracking Challenge', as measured by biologically relevant metrics.
28

Marsolo, Keith Allen. "A workflow for the modeling and analysis of biomedical data." Columbus, Ohio : Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1180309265.

29

Royer, Loic. "Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-62562.

Abstract:
Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graphs are a graph-theoretic approach to the lossless and compact representation of complex networks. The approach groups edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility. We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high quality interactomes.
Third, we apply power graph analysis to networks derived from text mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.
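
The abstract reports an empirical time complexity that is sub-quadratic in the number of edges e. One common way to obtain such an exponent, sketched below, is a least-squares fit of log(run time) against log(edge count); whether the thesis used this exact procedure is an assumption, and the timings in the example are synthetic values generated from a known power law, not measurements from the thesis.

```python
import math


def fit_power_law_exponent(sizes, times):
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic timings that follow t = c * e**1.71 exactly (illustrative only).
edge_counts = [1_000, 2_000, 4_000, 8_000, 16_000]
timings = [3e-6 * e ** 1.71 for e in edge_counts]

exponent = fit_power_law_exponent(edge_counts, timings)
```

With real benchmark timings the fitted slope is only an estimate, so it is usually reported together with the range of network sizes over which it was measured.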
30

Harvey, William John. "Understanding High-Dimensional Data Using Reeb Graphs." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1342614959.

31

Sequeira, José Francisco Rodrigues. "Automatic knowledge base construction from unstructured text." Master's thesis, Universidade de Aveiro, 2016. http://hdl.handle.net/10773/17910.

Abstract:
Mestrado em Engenharia de Computadores e Telemática

Taking into account the overwhelming number of biomedical publications being produced, the effort required for a user to efficiently explore those publications in order to establish relationships between a wide range of concepts is staggering. This dissertation presents GRACE, a web-based platform that provides an advanced graphical exploration interface that allows users to traverse the biomedical domain in order to find explicit and latent associations between annotated biomedical concepts belonging to a variety of semantic types (e.g., Genes, Proteins, Disorders, Procedures and Anatomy). The knowledge base utilized is a collection of MEDLINE articles with English abstracts. These annotations are then stored in an efficient data store that allows for complex queries and high-performance data delivery. Concept relationships are inferred through statistical analysis, applying association measures to annotated terms. These processes grant the graphical interface the ability to create, in real time, a data visualization in the form of a graph for the exploration of these biomedical concept relationships.

Tendo em conta o crescimento do número de publicações biomédicas a serem produzidas todos os anos, o esforço exigido para que um utilizador consiga, de uma forma eficiente, explorar estas publicações para conseguir estabelecer associações entre um conjunto alargado de conceitos torna esta tarefa exaustiva. Nesta dissertação apresentamos uma plataforma web chamada GRACE, que providencia uma interface gráfica de exploração que permite aos utilizadores navegar pelo domínio biomédico em busca de associações explícitas ou latentes entre conceitos biomédicos pertencentes a uma variedade de domínios semânticos (i.e., Genes, Proteínas, Doenças, Procedimentos e Anatomia). A base de conhecimento usada é uma coleção de artigos MEDLINE com resumos escritos na língua inglesa.
Estas anotações são armazenadas numa base de dados que permite pesquisas complexas e obtenção de dados com alta performance. As relações entre conceitos são inferidas a partir de análise estatística, aplicando medidas de associações entre os conceitos anotados. Estes processos permitem à interface gráfica criar, em tempo real, uma visualização de dados, na forma de um grafo, para a exploração destas relações entre conceitos do domínio biomédico.
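
The abstract states that concept relationships are inferred by applying association measures to annotated terms, without naming the measures. As a minimal sketch, the code below uses pointwise mutual information (PMI) over document-level co-occurrence; the choice of PMI, the function name `pmi_scores` and the toy documents are all assumptions made for illustration and are not taken from GRACE.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Pointwise mutual information for every co-occurring concept pair.

    documents: list of concept lists, one list per annotated abstract.
    """
    n_docs = len(documents)
    concept_counts = Counter()
    pair_counts = Counter()
    for concepts in documents:
        unique = sorted(set(concepts))  # count each concept once per document
        concept_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n_docs
        p_a = concept_counts[a] / n_docs
        p_b = concept_counts[b] / n_docs
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Hypothetical annotated abstracts (concept names only).
docs = [
    ["BRCA1", "breast cancer"],
    ["BRCA1", "breast cancer"],
    ["TP53", "breast cancer"],
    ["TP53"],
]
scores = pmi_scores(docs)
```

Pairs with a positive score co-occur more often than expected under independence, so a score threshold is one possible way to decide which edges to draw in an exploration graph.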
32

Bourqui, Romain. "Décompositions et Visualisations de graphes : applications aux données biologiques." Thesis, Bordeaux 1, 2008. http://www.theses.fr/2008BOR13630/document.

Abstract:
La quantité d’informations stockée dans les bases de données est en constante augmentation, rendant ainsi nécessaire la mise au point de systèmes d’analyse et de visualisation. Nous nous intéressons dans cette thèse aux données relationnelles et plus particulièrement aux données biologiques. Cette thèse s’oriente autour de trois axes principaux : tout d’abord, la décomposition de graphes en groupes d’éléments « similaires » afin de détecter d’éventuelles structures de communauté ; le deuxième aspect consiste à mettre en évidence ces structures dans un système de visualisation, et dans un dernier temps, nous nous intéressons à l’utilisabilité de l’un de ces systèmes de visualisation via une évaluation expérimentale. Les travaux de cette thèse ont été appliqués sur des données réelles provenant de deux domaines de la biologie : les réseaux métaboliques et les réseaux d’interactions gènes-protéines.

The amount of information stored in databases is constantly increasing, making it necessary to develop systems for analysis and visualization. In this thesis, we are interested in relational data and, in particular, in biological data. This thesis focuses on three main axes: firstly, the decomposition of graphs into clusters of "similar" elements in order to detect possible community structures; secondly, highlighting these structures in a visualization system; and thirdly, the usability of one of these visualization systems, assessed through an experimental evaluation. The work presented in this thesis was applied to real data from two fields of biology: metabolic networks and gene-protein interaction networks.
33

Cuppens, Tania. "Conséquences du contexte haplotypique sur la fonctionnalité des protéines : application à la mucoviscidose." Thesis, Brest, 2019. http://www.theses.fr/2019BRES0031/document.

Abstract:
Notre génome contient des centaines de milliers de variants génétiques, qui pour la plupart, n’ont aucun impact sur notre santé. Après séquençage, il faut les filtrer pour ne conserver que ceux qui sont potentiellement impliqués dans une maladie. On utilise des annotateurs qui prédisent l’impact des variants. Ces prédictions sont faites sans tenir compte des variants en cis dans le même gène. Pourtant, des variants neutres peuvent, lorsqu’ils sont réunis chez un individu, devenir délétères. J’ai donc développé l’outil bioinformatique GEMPROT qui permet de visualiser l’effet des variants génétiques sur la séquence protéique et de mettre en évidence les combinaisons de variants touchant un même domaine fonctionnel. J’ai ensuite étudié l’impact de deux variants associés à la p.Phe508del (508del) sur la protéine CFTR. Le variant p.Val470M est présent sur tous les haplotypes portant la délétion mais pas sur la séquence de référence, qui est généralement utilisée pour la construction de plasmides. Nous avons montré des différences de fonction de la protéine CFTR selon l’acide aminé en position 470. La fonction est augmentée avec une Valine et il convient donc de s’assurer, lors de la construction de plasmides, que le contexte haplotypique des variants étudiés est bien respecté. Le variant p.Ile1027Thr conduit à une dégradation de la fonction de la protéine 508del. Ce variant n’est présent que sur une partie des haplotypes 508del et pourrait donc avoir un effet modificateur de l’expression de la délétion. En conclusion, nous montrons l’importance de la prise en compte des contextes haplotypiques dans l’étude des maladies et proposons un outil bioinformatique pour le faire.

We all carry hundreds of thousands of genetic variants in our genome, most of which have no impact on our health. After sequencing, they must be filtered to retain only those potentially involved in a disease.
Annotators that predict the impact of variants are used for this purpose. These predictions are made for each variant independently, without considering variants in cis in the same gene. However, neutral variants can become deleterious when they occur together in one individual. I therefore developed the bioinformatics tool GEMPROT, which makes it possible to visualize the effect of genetic variants on the protein sequence and to highlight combinations of variants affecting the same functional domain. I then studied the impact of two variants associated with p.Phe508del (508del) on CFTR protein function. The variant p.Val470M is present on all haplotypes carrying the deletion but not on the reference sequence, which is generally used for the construction of plasmids. We have shown differences in the function of the mutated CFTR protein 508del according to the amino acid at position 470. The function is increased with a valine, and it is therefore necessary to ensure, when constructing plasmids, that the haplotype context of the studied variants is respected. The variant p.Ile1027Thr leads to a degradation of the function of the 508del protein. This variant is present only on a portion of the 508del haplotypes and could therefore have a modifying effect on the expression of the deletion. In conclusion, we show the importance of considering haplotype contexts in the study of diseases and propose a bioinformatics tool to do so.
34

Abril, Ferrando Josep Francesc. "Comparative analysis of eukaryotic gene sequence features." Doctoral thesis, Universitat Pompeu Fabra, 2005. http://hdl.handle.net/10803/7108.

Abstract:
L'incessant augment del nombre de seqüències genòmiques, juntament amb l'increment del nombre de tècniques experimentals de les que es disposa, permetrà obtenir el catàleg complet de les funcions cel·lulars de diferents organismes, incloent-hi la nostra espècie. Aquest catàleg definirà els fonaments sobre els que es podrà entendre millor com els organismes funcionen a nivell molecular. Al mateix temps es tindran més pistes sobre els canvis que estan associats amb les malalties. Per tant, la seqüència en brut, tal i com s'obté dels projectes de seqüenciació de genomes, no té cap valor sense les anàlisis i la subsegüent anotació de les característiques que defineixen aquestes funcions. Aquesta tesi presenta la nostra contribució en tres aspectes relacionats de l'anotació dels gens en genomes eucariotes.

Primer, la comparació a nivell de seqüència entre els genomes humà i de ratolí es va dur a terme mitjançant un protocol semi-automàtic. El programa de predicció de gens SGP2 es va desenvolupar a partir d'elements d'aquest protocol. El concepte al darrera de l'SGP2 és que les regions de similaritat obtingudes amb el programa TBLASTX, es fan servir per augmentar la puntuació dels exons predits pel programa geneid, amb el que s'obtenen conjunts d'anotacions més acurats d'estructures gèniques. SGP2 té una especificitat que és prou gran com per que es puguin validar experimentalment via RT-PCR. La validació de llocs d'splicing emprant la tècnica de la RT-PCR és un bon exemple de com la combinació d'aproximacions computacionals i experimentals produeix millors resultats que per separat.

S'ha dut a terme l'anàlisi descriptiva a nivell de seqüència dels llocs d'splicing obtinguts sobre un conjunt fiable de gens ortòlegs per humà, ratolí, rata i pollastre.
S'han explorat les diferències a nivell de nucleòtid entre llocs U2 i U12, pel conjunt d'introns ortòlegs que se'n deriva d'aquests gens. S'ha trobat que els senyals d'splicing ortòlegs entre humà i rossegadors, així com entre rossegadors, estan més conservats que els llocs no relacionats. Aquesta conservació addicional pot ser explicada però a nivell de conservació basal dels introns. D'altra banda, s'ha detectat més conservació de l'esperada entre llocs d'splicing ortòlegs entre mamífers i pollastre. Els resultats obtinguts també indiquen que les classes intròniques U2 i U12 han evolucionat independentment des de l'ancestre comú dels mamífers i les aus. Tampoc s'ha trobat cap cas convincent d'interconversió entre aquestes dues classes en el conjunt d'introns ortòlegs generat, ni cap cas de substitució entre els subtipus AT-AC i GT-AG d'introns U12. Al contrari, el pas de GT-AG a GC-AG, i viceversa, en introns U2 no sembla ser inusual.

Finalment, s'han implementat una sèrie d'eines de visualització per integrar anotacions obtingudes pels programes de predicció de gens i per les anàlisis comparatives sobre genomes. Una d'aquestes eines, el gff2ps, s'ha emprat en la cartografia dels genomes humà, de la mosca del vinagre i del mosquit de la malària, entre d'altres. El programa gff2aplot i els filtres associats, han facilitat la tasca d'integrar anotacions de seqüència amb els resultats d'eines per la cerca d'homologia, com ara el BLAST.
S'ha adaptat també el concepte de pictograma a l'anàlisi comparativa de llocs d'splicing ortòlegs, amb el desenvolupament del programa compi.

El aumento incesante del número de secuencias genómicas, junto con el incremento del número de técnicas experimentales de las que se dispone, permitirá la obtención del catálogo completo de las funciones celulares de los diferentes organismos, incluida nuestra especie. Este catálogo definirá las bases sobre las que se pueda entender mejor el funcionamiento de los organismos a nivel molecular. Al mismo tiempo, se obtendrán más pistas sobre los cambios asociados a enfermedades. Por tanto, la secuencia en bruto, tal y como se obtiene en los proyectos de secuenciación masiva, no tiene ningún valor sin los análisis y la posterior anotación de las características que definen estas funciones. Esta tesis presenta nuestra contribución a tres aspectos relacionados de la anotación de los genes en genomas eucariotas.

Primero, la comparación a nivel de secuencia entre el genoma humano y el de ratón se llevó a cabo mediante un protocolo semi-automático. El programa de predicción de genes SGP2 se desarrolló a partir de elementos de dicho protocolo. El concepto sobre el que se fundamenta el SGP2 es que las regiones de similaridad obtenidas con el programa TBLASTX, se utilizan para aumentar la puntuación de los exones predichos por el programa geneid, con lo que se obtienen conjuntos más precisos de anotaciones de estructuras génicas. SGP2 tiene una especificidad suficiente como para validar esas anotaciones experimentalmente vía RT-PCR.
La validación de los sitios de splicing mediante el uso de la técnica de la RT-PCR es un buen ejemplo de cómo la combinación de aproximaciones computacionales y experimentales produce mejores resultados que por separado.

Se ha llevado a cabo el análisis descriptivo a nivel de secuencia de los sitios de splicing obtenidos sobre un conjunto fiable de genes ortólogos para humano, ratón, rata y pollo. Se han explorado las diferencias a nivel de nucleótido entre sitios U2 y U12 para el conjunto de intrones ortólogos derivado de esos genes. Se ha visto que las señales de splicing ortólogas entre humanos y roedores, así como entre roedores, están más conservadas que las no ortólogas. Esta conservación puede ser explicada en parte a nivel de conservación basal de los intrones. Por otro lado, se ha detectado mayor conservación de la esperada entre sitios de splicing ortólogos entre mamíferos y pollo. Los resultados obtenidos indican también que las clases intrónicas U2 y U12 han evolucionado independientemente desde el ancestro común de mamíferos y aves. Tampoco se ha hallado ningún caso convincente de interconversión entre estas dos clases en el conjunto de intrones ortólogos generado, ni ningún caso de substitución entre los subtipos AT-AC y GT-AG en intrones U12. Por el contrario, el paso de GT-AG a GC-AG, y viceversa, en intrones U2 no parece ser inusual.

Finalmente, se han implementado una serie de herramientas de visualización para integrar anotaciones obtenidas por los programas de predicción de genes y por los análisis comparativos sobre genomas. Una de estas herramientas, gff2ps, se ha utilizado para cartografiar los genomas humano, de la mosca del vinagre y del mosquito de la malaria.
El <br/>programa gff2aplot y los filtros asociados, han facilitado la tarea de <br/>integrar anotaciones a nivel de secuencia con los resultados obtenidos <br/>por herramientas de búsqueda de homología, como BLAST. Se ha adaptado <br/>también el concepto de pictograma al análisis comparativo de los sitios <br/>de splicing ortólogos, con el desarrollo del programa compi.<br>The constantly increasing amount of available genome sequences, along <br/>with an increasing number of experimental techniques, will help to <br/>produce the complete catalog of cellular functions for different <br/>organisms, including humans. Such a catalog will define the base from <br/>which we will better understand how organisms work at the molecular <br/>level. At the same time it will shed light on which changes are <br/>associated with disease. Therefore, the raw sequence from genome <br/>sequencing projects is worthless without the complete analysis and <br/>further annotation of the genomic features that define those functions. <br/>This dissertation presents our contribution to three related aspects of <br/>gene annotation on eukaryotic genomes.<br/> <br/>First, a comparison at sequence level of human and mouse genomes was <br/>performed by developing a semi-automatic analysis pipeline. The SGP2 <br/>gene-finding tool was developed from procedures used in this pipeline. <br/>The concept behind SGP2 is that similarity regions obtained by TBLASTX <br/>are used to increase the score of exons predicted by geneid, in order to <br/>produce a more accurate set of gene structures. SGP2 provides a <br/>specificity that is high enough for its predictions to be experimentally <br/>verified by RT-PCR. 
The RT-PCR validation of predicted splice junctions <br/>also serves as example of how combined computational and experimental <br/>approaches will yield the best results.<br/> <br/>Then, we performed a descriptive analysis at sequence level of the <br/>splice site signals from a reliable set of orthologous genes for human, <br/>mouse, rat and chicken. We have explored the differences at nucleotide <br/>sequence level between U2 and U12 for the set of orthologous introns <br/>derived from those genes. We found that orthologous splice signals <br/>between human and rodents and within rodents are more conserved than <br/>unrelated splice sites. However, additional conservation can be <br/>explained mostly by background intron conservation. Additional <br/>conservation over background is detectable in orthologous mammalian and <br/>chicken splice sites. Our results also indicate that the U2 and U12 <br/>intron classes have evolved independently since the split of mammals and <br/>birds. We found neither convincing case of interconversion between these <br/>two classes in our sets of orthologous introns, nor any single case of <br/>switching between AT-AC and GT-AG subtypes within U12 introns. In <br/>contrast, switching between GT-AG and GC-AG U2 subtypes does not appear <br/>to be unusual.<br/> <br/>Finally, we implemented visualization tools to integrate annotation <br/>features for gene- finding and comparative analyses. One of those tools, <br/>gff2ps, was used to draw the whole genome maps for human, fruitfly and <br/>mosquito. gff2aplot and the accompanying parsers facilitate the task of <br/>integrating sequence annotations with the output of homologybased tools, <br/>like BLAST.We have also adapted the concept of pictograms to the <br/>comparative analysis of orthologous splice sites, by developing compi.
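As a rough illustration of the SGP2 concept stated in the abstract above (similarity regions raise the score of predicted exons), the sketch below boosts each exon in proportion to how much of it is covered by a similarity region. Everything in it is an assumption made for illustration: the tuple layouts, the `boost_exon_scores` name, the `weight` parameter and the additive overlap-fraction formula are hypothetical and are not taken from SGP2, geneid or TBLASTX.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of shared positions between two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def boost_exon_scores(exons, regions, weight=0.5):
    """Return exons with scores raised by overlapping similarity regions.

    exons:   list of (start, end, score), inclusive coordinates (hypothetical layout)
    regions: list of (start, end, similarity_score) (hypothetical layout)
    weight:  illustrative scaling factor, not a value from SGP2
    """
    boosted = []
    for start, end, score in exons:
        length = end - start + 1
        bonus = 0.0
        for r_start, r_end, r_score in regions:
            shared = overlap(start, end, r_start, r_end)
            # Bonus grows with the fraction of the exon covered by the region.
            bonus += weight * r_score * shared / length
        boosted.append((start, end, score + bonus))
    return boosted


exons = [(100, 199, 2.0), (300, 399, 1.0)]
regions = [(150, 249, 4.0)]
result = boost_exon_scores(exons, regions)
```

In this toy input only the first exon overlaps the similarity region, so only its score changes; a real gene finder would then re-assemble gene structures from the re-scored exons.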
35

Block, Jeremy. "NMR Structure Improvement: A Structural Bioinformatics & Visualization Approach." Diss., 2010. http://hdl.handle.net/10161/2394.

Abstract:
The overall goal of this project is to enhance the physical accuracy of individual models in macromolecular NMR (Nuclear Magnetic Resonance) structures and the realism of variation within NMR ensembles of models, while improving agreement with the experimental data. A secondary overall goal is to combine synergistically the best aspects of NMR and crystallographic methodologies to better illuminate the underlying joint molecular reality. This is accomplished by using the powerful method of all-atom contact analysis (describing detailed sterics between atoms, including hydrogens); new graphical representations and interactive tools in 3D and virtual reality; and structural bioinformatics approaches to the expanded and enhanced data now available.
The resulting better descriptions of macromolecular structure and its dynamic variation enhance the effectiveness of the many biomedical applications that depend on detailed molecular structure, such as mutational analysis, homology modeling, molecular simulations, protein design, and drug design.
Dissertation
36

"NMR Structure Improvement: A Structural Bioinformatics & Visualization Approach." Diss., 2010. http://hdl.handle.net/10161/2394.

37

Rohrschneider, Markus. "Visualization of Metabolic Networks." Doctoral thesis, 2014. https://ul.qucosa.de/id/qucosa%3A13145.

Abstract:
The metabolism constitutes the universe of biochemical reactions taking place in a cell of an organism. These processes include the synthesis, transformation, and degradation of molecules for an organism to grow, to reproduce and to interact with its environment. A good way to capture the complexity of these processes is to represent them as a metabolic network, in which sets of molecules are transformed into products by a chemical reaction, and the products are processed further. The underlying graph model allows a structural analysis of this network using established graph-theoretical algorithms on the one hand, and a visual representation by applying layout algorithms combined with information visualization techniques on the other. In this thesis we will take a look at three different aspects of graph visualization within the context of biochemical systems: the representation and interactive exploration of static networks, the visual analysis of dynamic networks, and the comparison of two network graphs. We will demonstrate how established infovis techniques can be combined with new algorithms and applied to specific problems in the area of metabolic network visualization. We reconstruct the metabolic network covering the complete set of chemical reactions present in a generalized eukaryotic cell from real-world data available from a popular metabolic pathway database and present a suitable data structure. As the constructed network is very large, it is not feasible to display it as a whole. Instead, we introduce a technique to analyse this static network in a top-down approach, starting with an overview and displaying detailed reaction networks on demand. This exploration method is also applied to compare metabolic networks in different species and from different resources. As for the analysis of dynamic networks, we present a framework to capture changes in the connectivity as well as changes in the attributes associated with the network’s elements.
38

Park, Richard Won. "Visualization and analysis of cancer genome sequencing studies." Thesis, 2014. https://hdl.handle.net/2144/15066.

Abstract:
Large-scale genomics projects such as the Cancer Genome Atlas (TCGA) and the Encyclopedia of DNA Elements (ENCODE) involve generation of data at an unprecedented scale, requiring new computational techniques for analysis and interpretation. In the three studies I present in this thesis, I utilize these data sources to derive biological insights or to create visualization tools that enable others to obtain insights more easily. First, I examine the distribution of the lengths of copy number variations (CNVs) in the cancer genome. This analysis shows that a small number of genes are altered at a greater frequency than expected from a power law distribution, suggesting that a large number of genomes must be sequenced for a given tumor type to achieve a comprehensive discovery of somatic mutations. Second, I investigate germline CNVs in thousands of TCGA samples using single nucleotide polymorphism (SNP) array data to find variants that may confer increased susceptibility to cancer. This CNV-based genome-wide association study identified many germline CNVs that potentially increase risk in brain, breast, colorectal, renal, or ovarian cancers. Finally, I apply several visualization techniques to create tools for the TCGA and ENCODE projects in order to help investigators better process and synthesize meaning from large volumes of data. Seqeyes combines linear and circular genomic views to explore predicted structural variations to help guide experimental validation. The modEncode browser visualizes chromatin organization by integrating data from a multitude of histone marks and chromosomal proteins. These results present visualization as a useful strategy for rapid identification of salient genomic features from large, heterogeneous genomic datasets.
39

"Revealing Microbial Responses to Environmental Dynamics: Developing Methods for Analysis and Visualization of Complex Sequence Datasets." Doctoral diss., 2017. http://hdl.handle.net/2286/R.I.46177.

Abstract:
The greatest barrier to understanding how life interacts with its environment is the complexity in which biology operates. In this work, I present experimental designs, analysis methods, and visualization techniques to overcome the challenges of deciphering complex biological datasets. First, I examine an iron limitation transcriptome of Synechocystis sp. PCC 6803 using a new methodology. Until now, iron limitation in experiments of Synechocystis sp. PCC 6803 gene expression has been achieved through media chelation. Notably, chelation also reduces the bioavailability of other metals, whereas naturally occurring low iron settings likely result from a lack of iron influx and not as a result of chelation. The overall metabolic trends of previous studies are well-characterized but within those trends is significant variability in single gene expression responses. I compare previous transcriptomics analyses with our protocol that limits the addition of bioavailable iron to growth media to identify consistent gene expression signals resulting from iron limitation. Second, I describe a novel method of improving the reliability of centroid-linkage clustering results. The size and complexity of modern sequencing datasets often prohibit constructing distance matrices, which prevents the use of many common clustering algorithms. Centroid-linkage circumvents the need for a distance matrix, but has the adverse effect of producing input-order dependent results. In this chapter, I describe a method of cluster edge counting across iterated centroid-linkage results and reconstructing aggregate clusters from a ranked edge list without a distance matrix and input-order dependence. Finally, I introduce dendritic heat maps, a new figure type that visualizes heat map responses through expanding and contracting sequence clustering specificities. Heat maps are useful for comparing data across a range of possible states. 
However, data binning is sensitive to clustering cutoffs which are often arbitrarily introduced by researchers and can substantially change the heat map response of any single data point. With an understanding of how the architectural elements of dendrograms and heat maps affect data visualization, I have integrated their salient features to create a figure type aimed at viewing multiple levels of clustering cutoffs, allowing researchers to better understand the effects of environment on metabolism or phylogenetic lineages.
Dissertation/Thesis
Chapter 2 Excel file of transcriptome responses
Chapter 2 Perl scripts
Chapter 3 Cluster Aggregation Perl script
Chapter 4 Example of the top-down clustering method used to construct dendritic heat maps
Chapter 4 Perl scripts and dendritic heat map images
Doctoral Dissertation Geological Sciences 2017
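The edge-counting idea in the abstract above (run order-dependent centroid-linkage several times, count how often two items share a cluster, then rebuild clusters from the ranked edges) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's actual method or its Perl scripts: the 1-D numeric data, the greedy single-pass clustering, the `min_fraction` threshold and the union-find reconstruction are all hypothetical choices.

```python
import itertools
import random
from collections import Counter


def greedy_centroid_clusters(items, cutoff):
    """One order-dependent pass: join the nearest centroid within cutoff, else start a cluster."""
    centroids, members = [], []
    for idx, value in items:
        best, best_d = None, None
        for c, centroid in enumerate(centroids):
            d = abs(value - centroid)
            if d <= cutoff and (best is None or d < best_d):
                best, best_d = c, d
        if best is None:
            centroids.append(value)
            members.append([idx])
        else:
            members[best].append(idx)
            # Incremental update of the cluster mean.
            centroids[best] += (value - centroids[best]) / len(members[best])
    return members


def aggregate_clusters(data, cutoff, runs=50, min_fraction=0.5, seed=0):
    """Count co-clustered pairs over shuffled runs and merge frequently seen edges."""
    rng = random.Random(seed)
    items = list(enumerate(data))
    edge_counts = Counter()
    for _ in range(runs):
        rng.shuffle(items)  # a different input order on every run
        for cluster in greedy_centroid_clusters(items, cutoff):
            for a, b in itertools.combinations(sorted(cluster), 2):
                edge_counts[(a, b)] += 1

    parent = list(range(len(data)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Walk the ranked edge list and keep edges seen in enough runs.
    for (a, b), count in edge_counts.most_common():
        if count / runs >= min_fraction:
            parent[find(a)] = find(b)

    groups = {}
    for i in range(len(data)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


clusters = aggregate_clusters([1.0, 1.1, 0.9, 5.0, 5.2, 9.0], cutoff=0.5)
```

No full distance matrix is built here; only item-to-centroid distances are computed. Counting every pair inside a cluster is still quadratic in cluster size, so a production version would likely need a cheaper edge representation.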
40

Amir, El-ad David. "viSNE and Wanderlust, two algorithms for the visualization and analysis of high-dimensional single-cell data." Thesis, 2014. https://doi.org/10.7916/D8SB43VK.

Abstract:
The immune system presents a unique opportunity for studying development in mammals. White blood cells undergo differentiation and proliferation, a never-ending process throughout the life of the organism. Hematopoiesis, the development of cells in the immune system, depends upon the interaction between many different cell types (some of which comprise less than a tenth of a percent of the population), transient regulatory decisions, genomic rearrangement events, cell proliferation, and death. To capture these events we employ mass cytometry, a novel technology that measures fifty proteins simultaneously in single cells. Mass cytometry results in large quantities of high-dimensional data which challenges existing computational techniques. To address these challenges, we developed two dimensionality reduction algorithms for analyzing mass cytometry and other single-cell data. The first, viSNE, transforms high-dimensional data into an intuitive two-dimensional map, making it accessible to visual exploration. The second algorithm, Wanderlust, receives as input a static snapshot (where cells occupy different stages of their development) and constructs their developmental ordering: the developmental trajectory. viSNE maps healthy bone marrow into a canonical shape that separates cell subtypes. In leukemia, however, the shape is malformed: the maps of cancer samples are distinct from the healthy map and from each other. The algorithm highlights structure in the heterogeneity of surface phenotype expression in cancer, traverses the progression from diagnosis to relapse, and identifies a rare leukemia population in minimal residual disease settings. Wanderlust was applied to healthy B lineage cells, where the trajectory follows known marker expression trends and genetic recombination events. Using the Wanderlust trajectory we identified CD24 as an early marker of B cell development. 
The trajectory captures the coordination between several regulatory mechanisms (surface marker expression, signaling, proliferation and apoptosis) during crucial development checkpoints. As new technologies raise the number of simultaneously measured parameters in each cell to the hundreds, viSNE and Wanderlust will become a mainstay in analyzing and interpreting such experiments.
41

Silva, Márcio Rosa da. "Bioinformatics tools for the visualization and structural analysis of metabolic networks." 2006. http://d-nb.info/980738539/34.

42

Gomes, Miguel Dias Duarte Ferreira. "3D Visualization of very large databases - integrating and expanding the state of the art in bioinformatics and astroinformatics." Master's thesis, 2015. http://hdl.handle.net/10451/22427.

Abstract:
Master's thesis, Bioinformatics and Computational Biology (Bioinformatics), Universidade de Lisboa, Faculdade de Ciências, 2015
Visual exploration of data is essential to the scientific process. It is often the starting point and even the guiding reference for scientific thought. Both biology and astronomy face the common challenge of analysing large, highly multidimensional data sets. Today, visual exploration of tabular data, often in the form of point clouds, is done mainly with 2D representations. However, reduced dimensionality easily hides features and relations in the data. For example, dimensionality reduction easily produces overplotting and cluttered views. Multiple 2D panels are often used to mitigate this problem, but the link between data in different panels is frequently unclear. Studies indicate that reducing from 3D to 2D significantly reduces the amount of visual information in the analysis of genomic data. Curiously, 3D visualisation is not widespread in the analysis of point clouds. The technique is used almost exclusively in the study of fluids and fields, which are extended bodies. One reason is the lack of good tools for 3D selection and interaction with large sets of points. The extremely large archives produced by present-day astronomical surveys, together with the standards established by the International (Astronomical) Virtual Observatory for data exchange and application interaction, are producing a paradigm shift in the way data are explored. The current trend is to stop exploring data solely locally, that is, by bringing them to users' workstations, and to turn to on-line services to query and explore the archives, whether from a local workstation or from mobile devices.
The same kind of paradigm shift is seen in the biological sciences, where, for example, genomic data are stored in different on-line repositories. It therefore becomes natural to approach modern visual data exploration with on-line services as well. Indeed, this is becoming a reality with recent services such as Rapidgraph and Plot.ly, which are receiving attention from the astronomical community as well as from other fields. In biology, Epiviz, an on-line service designed for the visualisation of functional genomics data, has received great attention lately after being featured in the journal Nature. In this work a web application for data visualisation was developed, named SHIV, an acronym for Simple HTML Interactive Visualizator. This web application works as a client for another application, the Object Server, a data server. The Object Server is the application that will provide the European Space Agency's Gaia mission, a survey of 1% of the stars of the Milky Way (still more than a billion objects), with interactive visualisation functionality in both 2D and 3D. This work, the combination of the web client and the server application, aims to offer its users a platform capable of providing interactive visualisation of data from several domains, ranging from astronomical to genomic data. Users have at their disposal a tool accessible on any platform, from an ordinary desktop computer running Windows to a tablet running Android; as long as there is a network connection and a reasonably recent web browser, the application can be used.
To overcome both the limitations of browsers in terms of processing and storage capacity and the limitations in handling large amounts of data, an already proven data server, used mainly for astronomical data, was modified. The large amount of data to be visualised is a current problem in the astronomical domain, far exceeding the capacity of today's desktop computers, and everything suggests that, given the growth trend in bioinformatics, the same will happen there in the near future. To give users of ordinary computers the ability to visualise the Gaia mission catalogue, an application was developed that provides, among other features, levels-of-detail, detail-on-demand and linked views. The combination of levels-of-detail (the description of an object or set of objects with successive, progressively more complex levels of detail) with detail-on-demand (the ability to obtain only the data relevant to a given field of view or data filter) offers clients with limited capabilities a faithful view of the data, adapted to their constraints, whether of available resolution or of processing capacity. Linked views give users the possibility of linking several plots of the same data source; for example, when making a scatter plot of a set of samples, the user can see how a given selection relates to a histogram of mean expression. Because these capabilities, for both 2D and 3D visualisations, are offered by an application that works as a service, the data persist, which means that a user can start a visualisation on one device and finish it on another.
It also offers the possibility of sharing both the data and already created visualisations with other users. Within the scope of this work, several modifications and additions had to be made to the server application in order to integrate it into the bioinformatics domain. For example, the ability to load files in FASTA or FASTQ format was added, as well as files in GFF or GTF format, all of them common formats. The web-application serving capabilities were also improved, since the original application is focused on native clients. Several data transformation features were added, such as the ability to create the transpose of a given table or to generate sample distance matrices. The client was developed on top of Mike Bostock's D3.js library, which offers capabilities for producing dynamic, interactive graphics for the web using the widely used HTML5, Scalable Vector Graphics (SVG) and Cascading Style Sheets (CSS) specifications. For the look and interaction environment of the client, the Bootstrap library was also used; it offers a set of common elements such as buttons and forms, which make it easier to create modern interfaces that work similarly across browsers. In addition to interactive visualisation of data in one or two dimensions, through the widely used scatter plots, line charts, histograms, heatmaps and block charts, the application also offers basic capabilities for visualising data in three dimensions. 3D is discussed in this work because its use is still uncommon in bioinformatics and in the biological sciences in general.
Although there are uses, such as the visualisation of protein structure, in the rest of the domain mentions of the use of 3D to do science and generate knowledge are rare. One possible reason is that currently existing tools do not provide for the creation of three-dimensional visualisations. It is hoped that including 3D capabilities from the outset in an application intended as a working base for the future will encourage the use of 3D in bioinformatics. To demonstrate the capabilities of the combined applications, use cases are shown. The first, a typically astronomical use case, shows how data from the European Space Agency's Hipparcos mission, the first mission focused on precision astrometry, which made precise measurements of the positions of celestial objects, can be visualised in a Hertzsprung–Russell diagram. This colour-magnitude diagram is used in the study of stellar evolution in astronomy and astrophysics. At the same time, a scatter plot of the positions of the observed stars is created and visualised, and selections made in one plot are compared with their location in the other, using the linked-views functionality. The second use case is a typical example of exploratory bioinformatics. It starts by loading gene expression data obtained by Cap Analysis of Gene Expression from human samples of the FANTOM5 consortium. These 70 samples, mainly brain tissue together with some outliers such as uterus tissue, serve as the basis of the use case. After loading the data, an MA plot of gene expression in adult and newborn substantia nigra samples is created and visualised. Next, histograms are created for the breadth of gene expression and for the mean expression of the genes. These visualisations demonstrate the interactive capabilities of the application.
Next, the breadth of gene expression is compared with mean expression, and the option of adding regression lines to the plot is used to check for trends in the data. Then the sample distance matrix is created, which serves as the basis for a heatmap in which outlier samples can easily be seen. Finally, the use of 3D plots to visualise the information obtained in the heatmap is shown, along with how outliers could also be distinguished with them. To conclude, the work is discussed and areas on which future work can focus are presented.
Visual data exploration is essential to the scientific process. It is often the starting point and even the guiding reference for scientific thought. Both biology and astronomy face the common challenge of analysing large sets of highly multidimensional data. Current-day visual exploration of tabular data (point clouds) is mostly done using 2D representations. But reduced dimensionality easily hides features and relations in the data. As an example, collapsing dimensions easily produces overplotting and cluttered views. Multiple 2D panels are often used to mitigate this problem, but the link between data in different panels is frequently not clear. Studies indicate that reduction from 3D to 2D significantly reduces the quantity of visual information in the analysis of genomic data. Curiously, 3D visualisation is not widespread in the analysis of point clouds. It is almost exclusively used with fluids and fields, which are extended bodies. One of the reasons is a lack of good tools for 3D selection and interaction with large sets of points. The extremely large archives produced by today's astronomical surveys, together with the International (Astronomical) Virtual Observatory standards for data interexchange and application messaging, are producing a paradigm shift in the way data is explored.
The trend is no longer to download the data to the user’s workstation or mobile device and explore it locally, but instead to use on-line services for querying and exploring those archives. The same kind of paradigm shift is seen in the biological sciences where, for example, genomic data are stored in different on-line repositories. Thus, it becomes natural to address modern visual data exploration with on-line services as well. Indeed, this is becoming a reality, and recent services such as Rapidgraph and Plotly are receiving attention from the astronomical community among others. In biology, the Epiviz on-line service designed for visualisation of functional genomics data has received great attention lately, having been featured by Nature. In this work a web-based interactive visualization tool, the Simple HTML Interactive Visualizator (SHIV), was developed. It works in conjunction with server software, Object Server, which is used to provide the interactive 2D and 3D visualization infrastructure for the European Space Agency’s Gaia mission, a survey of over a billion stars, or 1% of the stellar content of the Milky Way. The combination of a web-based client with server software allows users, on normal computers and/or mobile devices, to visualize the large amounts of data that are common in the astronomy and astrophysics fields, and that are expected to appear in bioinformatics in the near future if the growth trend holds. This capacity is made possible by features such as levels-of-detail, detail-on-demand and linked views. The creation of progressively more complex levels of detail for a given object or objects (levels-of-detail), in conjunction with the possibility to request only the data associated with a given viewport or filter (detail-on-demand), allows clients with limited resources and/or limited screen space to offer users visualizations that faithfully represent the totality of the data.
Allowing users to link views gives them the possibility to explore multiple dimensions of the same data by using several graphs to focus on specific features. The client offers common visualization tools, with the creation of scatter plots, histograms, heatmaps, line charts and block charts in two dimensions, as well as the creation of three-dimensional visualizations. It is hoped that the support for 3D since the inception of the client will provide users with the tools necessary to analyse their data in new and innovative ways.
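The combination of levels-of-detail and detail-on-demand described in the abstract above can be illustrated with a small server-side sketch: only points inside the requested viewport are considered, and if there are still too many for the client, a coarser level (bin counts on a grid) is returned instead. This is an assumption-laden toy, not the Object Server or SHIV implementation; the `detail_on_demand` name, the `max_points` and `grid` parameters and the response layout are hypothetical.

```python
from collections import Counter


def detail_on_demand(points, viewport, max_points=1000, grid=32):
    """Return raw points for a viewport, or grid bin counts if there are too many.

    points:   iterable of (x, y) pairs
    viewport: (x_min, y_min, x_max, y_max) requested by the client
    """
    x_min, y_min, x_max, y_max = viewport
    # Detail-on-demand: keep only what the client can currently see.
    visible = [(x, y) for x, y in points
               if x_min <= x <= x_max and y_min <= y <= y_max]
    if len(visible) <= max_points:
        return {"level": "points", "data": visible}

    # Level-of-detail: fall back to a coarse summary of the same region.
    width = (x_max - x_min) or 1.0
    height = (y_max - y_min) or 1.0
    bins = Counter()
    for x, y in visible:
        col = min(int((x - x_min) / width * grid), grid - 1)
        row = min(int((y - y_min) / height * grid), grid - 1)
        bins[(col, row)] += 1
    return {"level": "bins", "data": dict(bins)}


points = [(i / 10, i / 10) for i in range(100)]
fine = detail_on_demand(points, (0, 0, 5, 5), max_points=100)
coarse = detail_on_demand(points, (0, 0, 5, 5), max_points=10, grid=2)
```

The same viewport yields either the 51 visible points or two bin counts, depending on how much the client says it can handle; zooming in shrinks the viewport until the raw points fit.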
43

Luna, Augustin. "Formalization of molecular interaction maps in systems biology; Application to simulations of the relationship between DNA damage response and circadian rhythms." Thesis, 2013. https://hdl.handle.net/2144/14143.

Abstract:
Quantitative exploration of biological pathway networks must begin with a qualitative understanding of them. Often researchers aggregate and disseminate experimental data using regulatory diagrams with ad hoc notations leading to ambiguous interpretations of presented results. This thesis has two main aims. First, it develops software to allow researchers to aggregate pathway data diagrammatically using the Molecular Interaction Map (MIM) notation in order to gain a better qualitative understanding of biological systems. Secondly, it develops a quantitative biological model to study the effect of DNA damage on circadian rhythms. The second aim benefits from the first by making use of visual representations to identify potential system boundaries for the quantitative model. I focus first on software for the MIM notation - a notation to concisely visualize bioregulatory complexity and to reduce ambiguity for readers. The thesis provides a formalized MIM specification for software implementation along with a base layer of software components for the inclusion of the MIM notation in other software packages. It also provides an implementation of the specification as a user-friendly tool, PathVisio-MIM, for creating and editing MIM diagrams along with software to validate and overlay external data onto the diagrams. I focus secondly on the application of the MIM software to the quantitative exploration of the poorly understood role of SIRT1 and PARP1, two NAD+-dependent enzymes, in the regulation of circadian rhythms during DNA damage response. SIRT1 and PARP1 participate in the regulation of several key DNA damage-repair proteins and are the subjects of study as potential cancer therapeutic targets. In this part of the thesis, I present an ordinary differential equation (ODE) model that simulates the core circadian clock and the involvement of SIRT1 in both the positive and negative arms of circadian regulation. 
This model is then used to predict a potential role for the competition for NAD+ supplies by SIRT1 and PARP1 leading to the observed behavior of primarily phase advancement of circadian oscillations during DNA damage response. The model further predicts a potential mechanism by which multiple forms of post-transcriptional modification may cooperate to produce a primarily phase advancement.
44

Bazurto, Blacio Voltaire. "LivelyViz: an approach to develop interactive collaborative web visualizations." Thesis, 2016. http://hdl.handle.net/1828/7698.

Abstract:
We investigate the development of collaborative data dashboards composed of web visualization components. For this, we explore the use of Lively Web as a development platform and provide a framework for developing collaborative scientific web visualizations. We use a modern thin-client approach that moves most of the application-specific processing logic from the client side to the server side, leveraging the implementation of reusable web services. As a web application, it provides users with multi-platform and multi-device compatibility along with enhanced concurrent access from remote locations. Our platform focuses on providing reusable, interactive, extensible and tightly integrated web visualization components. Such visualization components are designed to be readily usable in distributed-synchronous collaborative environments. As a use case we consider the development of a dashboard for researchers working with bioinformatics datasets, in particular poxvirus data. We argue that our thin-client approach for developing collaborative web visualizations can greatly benefit researchers in different geographic locations in their mission of analyzing datasets as a team.
Graduate
45

Royer, Loic. "Unraveling the Structure and Assessing the Quality of Protein Interaction Networks with Power Graph Analysis." Doctoral thesis, 2010. https://tud.qucosa.de/id/qucosa%3A24399.

Abstract:
Molecular biology has entered an era of systematic and automated experimentation. High-throughput techniques have moved biology from small-scale experiments focused on specific genes and proteins to genome and proteome-wide screens. One result of this endeavor is the compilation of complex networks of interacting proteins. Molecular biologists hope to understand life's complex molecular machines by studying these networks. This thesis addresses three open problems centered upon their analysis and quality assessment. First, we introduce power graph analysis as a novel approach to the representation and visualization of biological networks. Power graph analysis is a graph-theoretic approach to the lossless and compact representation of complex networks. It groups edges into cliques and bicliques, and nodes into a neighborhood hierarchy. We demonstrate power graph analysis on five examples, and show its advantages over traditional network representations. Moreover, we evaluate the algorithm's performance on a benchmark, test the robustness of the algorithm to noise, and measure its empirical time complexity at O(e^1.71), which is sub-quadratic in the number of edges e. Second, we tackle the difficult and controversial problem of data quality in protein interaction networks. We propose a novel measure for accuracy and completeness of genome-wide protein interaction networks based on network compressibility. We validate this new measure by i) verifying the detrimental effect of false positives and false negatives, ii) showing that gold standard networks are highly compressible, iii) showing that authors' choice of confidence thresholds is consistent with high network compressibility, iv) presenting evidence that compressibility is correlated with co-expression, co-localization and shared function, v) showing that complete and accurate networks of complex systems in other domains exhibit levels of compressibility similar to those of current high quality interactomes. 
Third, we apply power graph analysis to networks derived from text-mining as well as to gene expression microarray data. In particular, we present i) the network-based analysis of genome-wide expression profiles of the neuroectodermal conversion of mesenchymal stem cells, ii) the analysis of regulatory modules in a rare mitochondrial cytopathy, Mitochondrial Encephalomyopathy, Lactic acidosis, and Stroke-like episodes (MELAS), and iii) an investigation of the biochemical causes behind the enhanced biocompatibility of tantalum compared with titanium.