Academic literature on the topic 'Genomics workflow'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Genomics workflow.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Genomics workflow"

1

John, Aji, Kathleen Muenzen, and Kristiina Ausmees. "Evaluation of serverless computing for scalable execution of a joint variant calling workflow." PLOS ONE 16, no. 7 (2021): e0254363. http://dx.doi.org/10.1371/journal.pone.0254363.

Abstract:
Advances in whole-genome sequencing have greatly reduced the cost and time of obtaining raw genetic information, but the computational requirements of analysis remain a challenge. Serverless computing has emerged as an alternative to using dedicated compute resources, but its utility has not been widely evaluated for standardized genomic workflows. In this study, we define and execute a best-practice joint variant calling workflow using the SWEEP workflow management system. We present an analysis of performance and scalability, and discuss the utility of the serverless paradigm for executing workflows in the field of genomics research. The GATK best-practice short germline joint variant calling pipeline was implemented as a SWEEP workflow comprising 18 tasks. The workflow was executed on Illumina paired-end read samples from the European and African super populations of the 1000 Genomes project phase III. Cost and runtime increased linearly with increasing sample size, although runtime was driven primarily by a single task for larger problem sizes. Execution took a minimum of around 3 hours for 2 samples, up to nearly 13 hours for 62 samples, with costs ranging from $2 to $70.
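As an illustration of the scatter/gather structure this abstract describes, the sketch below expresses per-sample GVCF calling followed by joint genotyping as a small Python task graph. It is not the SWEEP system: the GATK4 command names and flags are real, but the reference, sample names, and thread-pool execution are illustrative stand-ins for serverless tasks.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

REF = "ref.fasta"                      # hypothetical reference genome
SAMPLES = ["HG00096", "HG00097"]       # hypothetical 1000 Genomes sample IDs

def run(cmd):
    """Run one pipeline task; each call could equally be a serverless function."""
    subprocess.run(cmd, check=True)

def haplotype_caller(sample):
    # Per-sample GVCF calling: this scatter phase parallelizes cleanly.
    run(["gatk", "HaplotypeCaller", "-R", REF, "-I", f"{sample}.bam",
         "-O", f"{sample}.g.vcf.gz", "-ERC", "GVCF"])

# Scatter: call each sample independently (separate tasks in a real system).
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(haplotype_caller, SAMPLES))

# Gather: consolidate the GVCFs, then joint-genotype the whole cohort.
run(["gatk", "GenomicsDBImport", "-L", "chr20",
     "--genomicsdb-workspace-path", "cohort_db"]
    + [arg for s in SAMPLES for arg in ("-V", f"{s}.g.vcf.gz")])
run(["gatk", "GenotypeGVCFs", "-R", REF,
     "-V", "gendb://cohort_db", "-O", "cohort.vcf.gz"])
```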
2

Diamond, Joel. "Genomics Within the Clinical Workflow." Oncology Times 40, no. 18 (2018): 13. http://dx.doi.org/10.1097/01.cot.0000546351.05658.8b.

3

Carey, Vincent J., Marcel Ramos, Benjamin J. Stubbs, et al. "Global Alliance for Genomics and Health Meets Bioconductor: Toward Reproducible and Agile Cancer Genomics at Cloud Scale." JCO Clinical Cancer Informatics, no. 4 (September 2020): 472–79. http://dx.doi.org/10.1200/cci.19.00111.

Abstract:
PURPOSE: Institutional efforts toward the democratization of cloud-scale data and analysis methods for cancer genomics are proceeding rapidly. As part of this effort, we bridge two major bioinformatic initiatives: the Global Alliance for Genomics and Health (GA4GH) and Bioconductor. METHODS: We describe in detail a use case in pan-cancer transcriptomics conducted by blending implementations of the GA4GH Workflow Execution Service and Tool Registry Service concepts with the Bioconductor curatedTCGAData and BiocOncoTK packages. RESULTS: We carried out the analysis with a formally archived workflow and container at dockstore.org and a workspace and notebook at app.terra.bio. The analysis identified relationships between microsatellite instability and biomarkers of immune dysregulation at a finer level of granularity than previously reported. Our use of standard approaches to containerization and workflow programming allows this analysis to be replicated and extended. CONCLUSION: Experimental use of dockstore.org and app.terra.bio in concert with Bioconductor enabled novel statistical analysis of large genomic projects without the need for local supercomputing resources, but involved challenges related to container design, script archiving, and unit testing. Best practices and cost/benefit metrics for the management and analysis of globally federated genomic data and annotation are evolving. The creation and execution of use cases like the one reported here will be helpful in the development and comparison of approaches to federated data/analysis systems in cancer genomics.
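The GA4GH Workflow Execution Service (WES) referenced in this abstract is an HTTP API for submitting and monitoring workflow runs. Below is a minimal client sketch, assuming a WES v1 endpoint; the base URL is a placeholder, the parameters are illustrative, and the TRS workflow path is deliberately elided.

```python
import json
import time
import requests

WES = "https://wes.example.org/ga4gh/wes/v1"   # placeholder WES endpoint

# WES accepts multipart/form-data fields describing the workflow to run.
fields = {
    "workflow_url": "https://dockstore.org/api/ga4gh/trs/v2/tools/...",  # elided TRS path
    "workflow_type": "CWL",
    "workflow_type_version": "v1.0",
    "workflow_params": json.dumps({"input": "example"}),   # illustrative params
}
resp = requests.post(f"{WES}/runs",
                     files={k: (None, v) for k, v in fields.items()})
resp.raise_for_status()
run_id = resp.json()["run_id"]

# Poll until the run reaches a terminal state.
while True:
    state = requests.get(f"{WES}/runs/{run_id}/status").json()["state"]
    if state in {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}:
        break
    time.sleep(30)
print(run_id, state)
```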
4

Khvorykh, Gennady, Andrey Khrunin, Ivan Filippenkov, Vasily Stavchansky, Lyudmila Dergunova, and Svetlana Limborska. "A Workflow for Selection of Single Nucleotide Polymorphic Markers for Studying of Genetics of Ischemic Stroke Outcomes." Genes 12, no. 3 (2021): 328. http://dx.doi.org/10.3390/genes12030328.

Abstract:
In this paper we propose a workflow for studying the genetic architecture of ischemic stroke outcomes, developing the candidate-gene approach further. The workflow is based on an animal model of brain ischemia, comparative genomics, human genomic variation, and algorithms for selecting tagging single nucleotide polymorphisms (tagSNPs) in genes whose expression changed after ischemic stroke. The workflow starts from a set of rat genes whose expression changed in response to brain ischemia and results in a set of tagSNPs, which represent the other SNPs in the analyzed human genes and also influence their expression.
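The tagSNP selection step described here is commonly implemented as greedy clustering on pairwise linkage disequilibrium. Below is a minimal sketch under that assumption, using squared Pearson correlation of 0/1/2-coded genotypes as the r² measure; the threshold and data are illustrative.

```python
import numpy as np

def select_tag_snps(genotypes, r2_threshold=0.8):
    """Greedy tagSNP selection: genotypes is (n_samples, n_snps), coded 0/1/2.

    Each chosen tag 'covers' every untagged SNP whose r^2 with it exceeds
    the threshold, so the returned tags represent the remaining SNPs.
    """
    r2 = np.corrcoef(genotypes.T) ** 2          # pairwise r^2 between SNPs
    n = genotypes.shape[1]
    untagged, tags = set(range(n)), []
    while untagged:
        # Pick the SNP that covers the most currently untagged SNPs.
        best = max(untagged, key=lambda i: sum(r2[i, j] >= r2_threshold
                                               for j in untagged))
        tags.append(best)
        untagged -= {j for j in untagged if r2[best, j] >= r2_threshold}
    return tags

# Toy example: 100 individuals, 50 SNPs with random genotypes.
rng = np.random.default_rng(0)
print(select_tag_snps(rng.integers(0, 3, size=(100, 50))))
```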
5

Kulchak Rahm, Alanna, Nephi A. Walton, Lynn K. Feldman, et al. "User testing of a diagnostic decision support system with machine-assisted chart review to facilitate clinical genomic diagnosis." BMJ Health & Care Informatics 28, no. 1 (2021): e100331. http://dx.doi.org/10.1136/bmjhci-2021-100331.

Abstract:
Objectives: There is a need in clinical genomics for systems that assist in clinical diagnosis, analysis of genomic information and periodic reanalysis of results, and that can use information from the electronic health record to do so. Such systems should be built using the concepts of human-centred design, fit within clinical workflows and provide solutions to priority problems. Methods: We adapted a commercially available diagnostic decision support system (DDSS) to use extracted findings from a patient record and combine them with genomic variant information in the DDSS interface. Three representative patient cases were created in a simulated clinical environment for user testing. A semistructured interview guide was created to illuminate factors relevant to human factors in CDS design and organisational implementation. Results: Six individuals completed the user testing process. Tester responses were positive and noted good fit with real-world clinical genetics workflow. Technical issues related to interface, interaction and design were minor and fixable. Testers suggested solving issues related to terminology and usability through training and infobuttons. Time savings were estimated at 30%–50%, and additional uses, such as in-house clinical variant analysis, were suggested to increase fit with workflow and further address priority problems. Conclusion: This study provides preliminary evidence for the usability, workflow fit, acceptability and implementation potential of a modified DDSS that includes machine-assisted chart review. Continued development and testing using principles from human-centred design and implementation science are necessary to improve technical functionality and acceptability for multiple stakeholders, as well as organisational implementation potential, to improve the genomic diagnosis process.
6

Ukmar, G., G. E. M. Melloni, L. Raddrizzani, et al. "PATRI, a Genomics Data Integration Tool for Biomarker Discovery." BioMed Research International 2018 (June 28, 2018): 1–13. http://dx.doi.org/10.1155/2018/2012078.

Abstract:
The availability of genomic datasets in association with clinical, phenotypic, and drug sensitivity information represents an invaluable source for potential therapeutic applications, supporting the identification of new drug sensitivity biomarkers and pharmacological targets. Drug discovery and precision oncology can largely benefit from the integration of treatment molecular discriminants obtained from cell line models and clinical tumor samples; however this task demands comprehensive analysis approaches for the discovery of underlying data connections. Here we introduce PATRI (Platform for the Analysis of TRanslational Integrated data), a standalone tool accessible through a user-friendly graphical interface, conceived for the identification of treatment sensitivity biomarkers from user-provided genomics data, associated with information on sample characteristics. PATRI streamlines a translational analysis workflow: first, baseline genomics signatures are statistically identified, differentiating treatment sensitive from resistant preclinical models; then, these signatures are used for the prediction of treatment sensitivity in clinical samples, via random forest categorization of clinical genomics datasets and statistical evaluation of the relative phenotypic features. The same workflow can also be applied across distinct clinical datasets. The ease of use of the PATRI tool is illustrated with validation analysis examples, performed with sensitivity data for drug treatments with known molecular discriminants.
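The translational step PATRI automates — learning a sensitivity signature from preclinical models and applying it to clinical samples — can be sketched with a standard random forest. This is not the PATRI code; the matrices and labels below are synthetic stand-ins.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

# Stand-in data: expression signatures for 60 cell lines over 200 genes,
# labelled 1 = treatment-sensitive, 0 = resistant.
cell_line_expr = rng.normal(size=(60, 200))
sensitivity = rng.integers(0, 2, size=60)

# Train on the preclinical models...
model = RandomForestClassifier(n_estimators=500, random_state=0)
model.fit(cell_line_expr, sensitivity)

# ...then predict sensitivity probabilities for clinical tumor samples
# profiled on the same gene signature.
clinical_expr = rng.normal(size=(25, 200))
pred = model.predict_proba(clinical_expr)[:, 1]
print("predicted sensitive fraction:", (pred > 0.5).mean())
```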
7

Silva, Tiago C., Antonio Colaprico, Catharina Olsen, et al. "TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages." F1000Research 5 (June 29, 2016): 1542. http://dx.doi.org/10.12688/f1000research.8923.1.

Abstract:
Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no comprehensive tool that provides a complete integrative analysis harnessing the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative downstream analyses of different molecular data. We describe how to download, process and prepare TCGA data; by harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data; and by using Roadmap and ENCODE data, we provide a work plan to identify candidate biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiforme or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks.
8

Silva, Tiago C., Antonio Colaprico, Catharina Olsen, et al. "TCGA Workflow: Analyze cancer genomics and epigenomics data using Bioconductor packages." F1000Research 5 (December 28, 2016): 1542. http://dx.doi.org/10.12688/f1000research.8923.2.

Abstract:
Biotechnological advances in sequencing have led to an explosion of publicly available data via large international consortia such as The Cancer Genome Atlas (TCGA), The Encyclopedia of DNA Elements (ENCODE), and The NIH Roadmap Epigenomics Mapping Consortium (Roadmap). These projects have provided unprecedented opportunities to interrogate the epigenome of cultured cancer cell lines as well as normal and tumor tissues with high genomic resolution. The Bioconductor project offers more than 1,000 open-source software and statistical packages to analyze high-throughput genomic data. However, most packages are designed for specific data types (e.g. expression, epigenetics, genomics) and there is no one comprehensive tool that provides a complete integrative analysis of the resources and data provided by all three public projects. A need to create an integration of these different analyses was recently proposed. In this workflow, we provide a series of biologically focused integrative analyses of different molecular data. We describe how to download, process and prepare TCGA data and by harnessing several key Bioconductor packages, we describe how to extract biologically meaningful genomic and epigenomic data. Using Roadmap and ENCODE data, we provide a work plan to identify biologically relevant functional epigenomic elements associated with cancer. To illustrate our workflow, we analyzed two types of brain tumors: low-grade glioma (LGG) versus high-grade glioma (glioblastoma multiforme or GBM). This workflow introduces the following Bioconductor packages: AnnotationHub, ChIPSeeker, ComplexHeatmap, pathview, ELMER, GAIA, MINET, RTCGAToolbox, TCGAbiolinks.
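The TCGA retrieval step in this workflow is done in R with Bioconductor packages such as TCGAbiolinks; as a language-neutral illustration of the same data-access idea, the sketch below queries the public NCI GDC REST API for LGG and GBM transcriptome files. The query is illustrative, not part of the published workflow.

```python
import json
import requests

# Public NCI GDC API endpoint backing TCGA data distribution.
GDC_FILES = "https://api.gdc.cancer.gov/files"

filters = {"op": "and", "content": [
    {"op": "in", "content": {"field": "cases.project.project_id",
                             "value": ["TCGA-LGG", "TCGA-GBM"]}},
    {"op": "in", "content": {"field": "data_category",
                             "value": ["Transcriptome Profiling"]}},
]}

resp = requests.get(GDC_FILES, params={
    "filters": json.dumps(filters),
    "fields": "file_id,file_name,cases.project.project_id",
    "format": "JSON",
    "size": "5",
})
for hit in resp.json()["data"]["hits"]:
    print(hit["file_id"], hit["file_name"])
```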
9

Rossetto, Maurizio, Jia-Yee Samantha Yap, Jedda Lemmon, et al. "A conservation genomics workflow to guide practical management actions." Global Ecology and Conservation 26 (April 2021): e01492. http://dx.doi.org/10.1016/j.gecco.2021.e01492.

10

Ahmed, Azza E., Phelelani T. Mpangase, Sumir Panji, et al. "Organizing and running bioinformatics hackathons within Africa: The H3ABioNet cloud computing experience." AAS Open Research 1 (April 18, 2018): 9. http://dx.doi.org/10.12688/aasopenres.12847.1.

Abstract:
The need for portable and reproducible genomics analysis pipelines is growing globally as well as in Africa, especially with the growth of collaborative projects like the Human Heredity and Health in Africa consortium (H3Africa). The Pan-African H3Africa Bioinformatics Network (H3ABioNet) recognized the need for portable, reproducible pipelines adapted to heterogeneous compute environments, and for the nurturing of technical expertise in workflow languages and containerization technologies. To address this need, in 2016 H3ABioNet arranged its first Cloud Computing and Reproducible Workflows Hackathon, with the purpose of building key genomics analysis pipelines able to run on heterogeneous computing environments and meeting the needs of H3Africa research projects. This paper describes the preparations for this hackathon and reflects upon the lessons learned about its impact on building the technical and scientific expertise of African researchers. The workflows developed were made publicly available in GitHub repositories and deposited as container images on quay.io.

Dissertations / Theses on the topic "Genomics workflow"

1

Puthige, Ashwin Acharya. "Bioflow: A web based workflow management system for design and execution of genomics pipelines." Thesis, Virginia Tech, 2014. http://hdl.handle.net/10919/24809.

Abstract:
The cost of sequencing genomes has decreased drastically in the last few years, and knowledge of full genomes has increased the pace of advancements in functional genomics. Computational genomics, which analyses these sequences, has seen similar growth. The multitude of sequencing technologies has resulted in various formats for storing sequences, which in turn has led to the creation of many tools for DNA analysis: tools for sorting, indexing, analyzing read groups, and other tasks. Genomic analysis often requires the creation of pipelines that process DNA sequences by chaining together many tools, resulting in complex scripts that glue the tools together and pass output from one stage to the next. There are tools that allow such pipelines to be created through a graphical user interface, but they are complex to use, and it is difficult to quickly add newly developed tools to existing workflows. To solve these issues, we developed BioFlow, a web-based genomic workflow management system. Using BioFlow requires no programming skills. The integrated workflow designer allows workflows to be created and saved, with pipelines built by connecting tools through a visual connector. BioFlow provides an easy and simple interface that allows users to quickly add tools for use in any workflow. Audit logs are maintained at each stage, helping users identify errors and fix them easily.
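The hand-written glue scripts that BioFlow replaces typically chain tools stage by stage, piping output from one into the next. Below is a minimal sketch of such a chain, assuming bwa and samtools are on the PATH; the file names are hypothetical.

```python
import subprocess

REF, READS = "ref.fasta", "reads.fastq"   # hypothetical inputs

# Stage 1: align reads and stream the output straight into a sorting stage,
# the kind of hand-written glue a workflow manager replaces.
align = subprocess.Popen(["bwa", "mem", REF, READS], stdout=subprocess.PIPE)
subprocess.run(["samtools", "sort", "-o", "aln.sorted.bam"],
               stdin=align.stdout, check=True)
align.stdout.close()
if align.wait() != 0:
    raise RuntimeError("alignment failed")

# Stage 2: index the sorted alignments for downstream tools.
subprocess.run(["samtools", "index", "aln.sorted.bam"], check=True)
```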
2

Carrión, Collado Abel Antonio. "Management of generic and multi-platform workflows for exploiting heterogeneous environments on e-Science." Doctoral thesis, Universitat Politècnica de València, 2017. http://hdl.handle.net/10251/86179.

Abstract:
Scientific Workflows (SWFs) are widely used to model applications in e-Science. In this programming model, scientific applications are described as a set of tasks with dependencies among them. During the last decades, the execution of scientific workflows has been successfully performed on the available computing infrastructures (supercomputers, clusters and grids) using software programs called Workflow Management Systems (WMSs), which orchestrate the workload on top of these infrastructures. However, because each computing infrastructure has its own architecture and each scientific application exploits one of these infrastructures most efficiently, it is necessary to organize the way in which they are executed, and WMSs need to get the most out of all the available computing and storage resources. Traditionally, scientific workflow applications have been extensively deployed on high-performance computing infrastructures (such as supercomputers and clusters) and grids. In recent years, however, the advent of cloud computing has opened the door to using on-demand infrastructures to complement or even replace local ones. This raises new issues, such as the integration of hybrid resources and the compromise between infrastructure reutilization and elasticity, all on the basis of cost-efficiency. The main contribution of this thesis is an ad-hoc solution for managing workflows that exploits the capabilities of cloud computing orchestrators to deploy resources on demand according to the workload, and combines heterogeneous cloud providers (such as on-premise and public clouds) with traditional infrastructures (supercomputers and clusters) to minimize cost and response time. The thesis does not propose yet another WMS; rather, it demonstrates the benefits of integrating cloud orchestration when running complex workflows. It presents several configuration experiments across multiple heterogeneous backends, using a realistic comparative genomics workflow called Orthosearch, to migrate memory-intensive workload to public infrastructures while keeping other blocks of the experiment running locally. The running time and cost of the experiments are reported and best practices are suggested.
3

Liu, Yang, Saad M. Khan, Juexin Wang, et al. "PGen: large-scale genomic variations analysis workflow and browser in SoyKB." BIOMED CENTRAL LTD, 2016. http://hdl.handle.net/10150/624651.

Abstract:
Background: With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops to detect genome-scale genetic variations and apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and the Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. Results: We have developed both a Linux version on GitHub (https://github.com/pegasus-isi/PGen-GenomicVariationsWorkflow) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB) (http://soykb.org/Pegasus/index.php). Using PGen, we identified 10,218,140 SNPs and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage; 297,245 non-synonymous SNPs and 3,330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects, bringing the total to more than 500 soybean germplasm lines, have also been integrated. These SNPs are being utilized for trait improvement using genotype-to-phenotype prediction approaches developed in-house. To make NGS data easy to browse and access, we have also developed an NGS resequencing data browser (http://soykb.org/NGS_Resequence/NGS_index.php) within SoyKB that provides easy access to SNP and downstream analysis results for soybean researchers. Conclusion: The PGen workflow has been optimized for the most efficient analysis of soybean data through thorough testing and validation. This research serves as an example of best practices for the development of genomics data analysis workflows, integrating remote HPC resources and efficient data management with ease of use for biological users. The PGen workflow can also be easily customized for analysis of data from other species.
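The SNP/indel identification stage that PGen wraps can be approximated with a conventional joint-calling command chain. The sketch below uses bcftools rather than the actual PGen/Pegasus tasks; the reference and BAM paths are placeholders.

```python
import subprocess

REF = "soybean_ref.fa"                     # placeholder reference
BAMS = ["line001.bam", "line002.bam"]      # placeholder resequenced lines

# Pile up reads across all lines and call SNPs/indels jointly;
# bcftools call -mv emits variant sites only.
mpileup = subprocess.Popen(
    ["bcftools", "mpileup", "-f", REF, *BAMS], stdout=subprocess.PIPE)
subprocess.run(
    ["bcftools", "call", "-mv", "-Oz", "-o", "variants.vcf.gz"],
    stdin=mpileup.stdout, check=True)
mpileup.stdout.close()
mpileup.wait()

# Index the compressed VCF so region queries work downstream.
subprocess.run(["bcftools", "index", "variants.vcf.gz"], check=True)
```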
4

Couse, Madeline Hazel. "A bioinformatic workflow for analyzing whole genomes in rare Mendelian disease." Thesis, University of British Columbia, 2017. http://hdl.handle.net/2429/61332.

Abstract:
The vast majority of the human genome (~98%) is non-coding. A symphony of non-coding sequences resides in the genome, interacting with genes and the environment to tune gene expression. Functional non-coding sequences include enhancers, silencers, promoters, non-coding RNA and insulators. Variation in these non-coding sequences can cause disease, yet clinical sequencing in patients with rare Mendelian disease currently focuses mostly on variants in the ~2% of the genome that codes for protein. Indeed, variants in protein-coding genes that can explain a phenotype are identified in less than half of patients with suspected genetic disease by whole exome sequencing (WES). With the dramatic reduction in the cost of whole genome sequencing (WGS), the development of algorithms to detect variants longer than 50 bp (structural variants, SVs), and improved annotation of the non-coding genome, it is now possible to interrogate the entire spectrum of genetic variation to identify a pathogenic mutation. A comprehensive pipeline is needed to analyze non-coding variation and structural variation from WGS. In this thesis, I developed and benchmarked a bioinformatics workflow to detect pathogenic non-coding SNVs/indels and pathogenic SVs, and applied this workflow to unsolved patients with rare Mendelian disorders. The pipeline detected ~80-90% of deletions, ~90% of duplications, ~65% of inversions, and ~50% of insertions in a simulated genome and the NA12878 genome. It captured the majority of known pathogenic non-coding single nucleotide variants (SNVs) and insertions/deletions (indels), and selectively prioritized a spiked-in known pathogenic non-coding SNV. Several interesting candidate variants were detected in patients, but none could be convincingly implicated as pathogenic. The bioinformatic workflow described in this thesis is complementary to sequencing pipelines that analyze only protein-coding variants from whole genomes. Application of this workflow to larger cohorts of patients with rare Mendelian diseases should identify pathogenic non-coding variants and SVs to increase the diagnostic yield of clinical sequencing studies, assist the management of genetic diseases, and contribute knowledge of novel pathogenic variants to the scientific community.
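Benchmark figures like the deletion recall quoted above are typically computed by matching called SVs against a truth set at a reciprocal-overlap threshold. Below is a minimal sketch under that assumption, with toy intervals.

```python
def reciprocal_overlap(a, b):
    """Fraction of overlap relative to the longer of two (start, end) intervals."""
    inter = min(a[1], b[1]) - max(a[0], b[0])
    return max(0, inter) / max(a[1] - a[0], b[1] - b[0])

def recall(truth, called, min_ro=0.5):
    """Share of truth SVs matched by any call at >= min_ro reciprocal overlap."""
    hits = sum(any(reciprocal_overlap(t, c) >= min_ro for c in called)
               for t in truth)
    return hits / len(truth)

# Toy deletion sets on one chromosome: (start, end) pairs.
truth_dels = [(1000, 2000), (5000, 5400), (9000, 9800)]
called_dels = [(1050, 1980), (9100, 9900)]
print(f"deletion recall: {recall(truth_dels, called_dels):.2f}")  # 0.67
```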
5

Schlaffner, Christoph Norbert. "Proteogenomics for personalised molecular profiling." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/275137.

Abstract:
Technological advancements in mass spectrometry allowing quantification of almost complete proteomes make proteomics a key platform for generating unique functional molecular data. Furthermore, the integrative analysis of genomic and proteomic data, termed proteogenomics, has emerged as a new field revealing insights into gene expression regulation, cell signalling, and disease processes. However, the lack of software tools for high-throughput integration and unbiased modification and variant detection hinders efforts at large-scale proteogenomics studies. The main objectives of this work are to address these issues by developing and applying new software tools and data analysis methods. Firstly, I address the mapping of peptide sequences to reference genomes. I introduce a novel tool for high-throughput mapping, highlight its unique features facilitating quantitative and post-translational modification mapping alongside accounting for amino acid substitutions, and benchmark its performance. Furthermore, I offer an additional tool that permits the generation of web-accessible hubs of genome-wide mappings. To enable unbiased identification of post-translational modifications and amino acid substitutions in high-resolution mass spectrometry data, I present algorithmic updates to the mass-tolerant blind spectrum comparison tool 'MS SMiV'. I demonstrate the applicability of the changes by benchmarking against a published mass-tolerant database search of a high-resolution tandem mass spectrometry dataset. I then present the application of 'MS SMiV' to a panel of 50 colorectal cancer cell lines, showing that the adaptation of 'MS SMiV' outperforms traditional sequence-database-based identification of single amino acid variants. Furthermore, I highlight the utility of mass-tolerant spectrum matching in combination with isobaric-labelled quantitative proteomics in distinguishing between post-translational modifications and amino acid variants of similar mass. In the last part of this work I integrate both tools with a high-throughput proteogenomic identification pipeline and apply it to a pilot study of chondrocytes derived from 12 osteoarthritic individuals. I show the value of this approach in identifying variation between individuals and across molecular levels, highlighting it with individual examples, and show that multiplexed proteogenomics can be used to infer the genotypes of individuals.
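Mapping peptide sequences to a reference genome, as the first tool described here does, reduces to coordinate arithmetic once the parent protein's coding exons are known. Below is a minimal sketch for a forward-strand transcript with hypothetical exon coordinates.

```python
def peptide_to_genome(pep_start_aa, pep_len_aa, cds_exons):
    """Project a peptide (1-based AA offset in its protein) onto genomic
    coordinates, given the CDS exons of a forward-strand transcript."""
    start_nt = (pep_start_aa - 1) * 3          # 0-based offset into the CDS
    end_nt = start_nt + pep_len_aa * 3         # exclusive
    blocks, consumed = [], 0
    for exon_start, exon_end in cds_exons:     # genomic, 0-based half-open
        exon_len = exon_end - exon_start
        lo = max(start_nt, consumed)
        hi = min(end_nt, consumed + exon_len)
        if lo < hi:                            # peptide overlaps this exon
            blocks.append((exon_start + lo - consumed,
                           exon_start + hi - consumed))
        consumed += exon_len
    return blocks

# Hypothetical 2-exon CDS; a 5-residue peptide starting at residue 30
# spans the exon junction, yielding two genomic blocks.
print(peptide_to_genome(30, 5, [(10_000, 10_100), (20_000, 20_200)]))
```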
6

Cantão, Mauricio Egidio. "Abordagem algébrica para seleção de clones ótimos em projetos genomas e metagenomas." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/95/95131/tde-17092010-123112/.

Abstract:
Due to the wide diversity of unknown microorganisms in the environment, 99% of them cannot be grown in traditional culture media in laboratories. Metagenomics projects have therefore been proposed to study microbial communities present in the environment using molecular techniques, especially sequencing, and an accumulation of sequences produced by these projects is expected in the coming years. The sequences produced by genomics and metagenomics projects present several challenges for treatment, storage and analysis, such as the search for clones containing genes of interest. This work presents an algebraic approach, based on process algebra, that dynamically defines and manages the rules for selecting clones from genomic and metagenomic libraries. Furthermore, a web interface was developed to allow researchers to easily create and execute their own clone-selection rules against genomic and metagenomic sequence databases. The software was tested on genomic and metagenomic libraries and was able to select clones containing genes of interest.
7

Oluwaseun, Ajayi Olabode. "An evaluation of galaxy and ruffus-scripting workflows system for DNA-seq analysis." University of the Western Cape, 2018. http://hdl.handle.net/11394/6765.

Abstract:
Functional genomics determines the biological functions of genes on a global scale, using large volumes of data obtained through techniques including next-generation sequencing (NGS). The application of NGS in biomedical research is gaining momentum, and with its adoption becoming more widespread, there is an increasing need for access to customizable computational workflows that can simplify, and offer access to, computer-intensive analyses of genomic data. In this study, the Galaxy and Ruffus frameworks were designed and implemented with a view to addressing the challenges faced in biomedical research. Galaxy, a graphical web-based framework, allows researchers to build a graphical NGS data analysis pipeline for accessible, reproducible, and collaborative data sharing. Ruffus, a UNIX command-line framework used by bioinformaticians as a Python library to write scripts in object-oriented style, allows a workflow to be built in terms of task dependencies and execution logic. In this study, a dual data analysis technique was explored, focusing on a comparative evaluation of the Galaxy and Ruffus frameworks for composing analysis pipelines. To this end, we developed an analysis pipeline in both Galaxy and Ruffus for the analysis of Mycobacterium tuberculosis sequence data. Preliminary analysis revealed that the pipeline in Galaxy displayed a higher percentage of load and store instructions, while pipelines in Ruffus tended to be CPU-bound and memory-intensive. CPU usage, memory utilization, and runtime execution are graphically represented in this study. Our evaluation suggests that workflow frameworks have distinctly different features, from ease of use, flexibility, and portability to architectural design.
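Ruffus, one of the two frameworks evaluated in this thesis, expresses task dependencies as decorated Python functions. Below is a minimal sketch in that style, assuming the ruffus package is installed; the file names are hypothetical and the trimming step is a stand-in for a real tool.

```python
from ruffus import transform, suffix, pipeline_run

# Seed files: one entry per sequenced sample (hypothetical names).
fastq_files = ["sampleA.fastq", "sampleB.fastq"]

@transform(fastq_files, suffix(".fastq"), ".trimmed.fastq")
def trim_reads(input_file, output_file):
    # Stand-in for a real trimmer: copy reads through unchanged.
    with open(input_file) as src, open(output_file, "w") as dst:
        dst.write(src.read())

@transform(trim_reads, suffix(".trimmed.fastq"), ".stats.txt")
def read_stats(input_file, output_file):
    # Downstream task: Ruffus runs it only after trim_reads is up to date.
    with open(input_file) as src, open(output_file, "w") as dst:
        dst.write(f"{sum(1 for _ in src) // 4} reads\n")

if __name__ == "__main__":
    pipeline_run([read_stats])   # Ruffus resolves and executes dependencies
```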
8

Voegele, Catherine. "Development of an integrated Information Technology System for management of laboratory data and next-generation sequencing workflows within a cancer genomics research platform." Thesis, Lyon 1, 2015. http://www.theses.fr/2015LYO10095/document.

Abstract:
The aim of my thesis work was to develop bioinformatics tools to improve traditional scientific information management within a large research centre, and especially within a genomics platform. Three tools were developed: an electronic laboratory notebook, a laboratory information management system for genomics applications including next-generation sequencing, and a sample management system for large biobanks. This work was conducted in close collaboration with biologists, epidemiologists and IT specialists, and included setting up interactions between the different tools to form an integrated IT system. The three tools were rapidly adopted by all the scientists of the research centre and are now used daily to track all laboratory activities and, more globally, the research centre's other scientific activities. These tools are transferable to other research institutes.
9

Weigl, Julia [Verfasser]. "Development of protocols and workflows for a fast gene synthesis and de novo synthesis of viral genomes : Entwicklung von Protokollen und Arbeitsabläufen für eine schnelle Gensynthese und de novo Synthese vitaler Genome / Julia Weigl." Hamburg : Staats- und Universitätsbibliothek Hamburg Carl von Ossietzky, 2018. http://d-nb.info/1223620972/34.

10

Witty, Derick. "Implementation of a Laboratory Information Management System To Manage Genomic Samples." Thesis, 2013. http://hdl.handle.net/1805/3521.

Abstract:
A Laboratory Information Management System (LIMS) is designed to manage laboratory processes and data, and can extend its core functionality through configuration tools and add-on modules to support the implementation of complex laboratory workflows. The purpose of this project is to demonstrate how laboratory data and processes from a complex workflow can be implemented using a LIMS. Genomic samples have become an important part of the drug development process due to advances in molecular testing technology, which evaluates genomic material for disease markers and provides efficient, cost-effective, and accurate results for a growing number of clinical indications. Preparing genomic samples for evaluation requires a complex laboratory process called the precision aliquotting workflow, which processes genomic samples into precisely created aliquots for analysis. The workflow is defined by a set of aliquotting scheme attributes that are executed based on scheme-specific rules logic: the aliquotting scheme defines the attributes of each aliquot based on the achieved sample recovery of the genomic sample, and the scheme rules logic executes the creation of the aliquots based on the scheme definitions. LabWare LIMS is a Windows®-based open-architecture system that manages laboratory data and workflow processes. A LabWare LIMS model was developed to implement the precision aliquotting workflow using a combination of core functionality and configured code.
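The scheme-driven rules logic described here can be illustrated outside a LIMS with plain Python: a scheme maps achieved sample recovery to the aliquots to create. The tiers and masses below are invented for illustration, not LabWare configuration.

```python
# Hypothetical aliquotting scheme: each tier keys on minimum recovered
# DNA mass (ng) and defines the aliquots to create at that tier.
SCHEME = [
    (1000, [("analysis", 400), ("backup", 400), ("retain", 200)]),
    (500,  [("analysis", 300), ("retain", 200)]),
    (0,    [("analysis", None)]),   # None = whatever mass remains
]

def plan_aliquots(recovered_ng):
    """Apply the scheme rules logic: pick the first tier the recovery satisfies."""
    for min_ng, aliquots in SCHEME:
        if recovered_ng >= min_ng:
            return [(label, mass if mass is not None else recovered_ng)
                    for label, mass in aliquots]

for recovery in (1250.0, 620.0, 180.0):
    print(recovery, "->", plan_aliquots(recovery))
```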

Book chapters on the topic "Genomics workflow"

1

Díaz, David, Sergio Gálvez, Juan Falgueras, et al. "Intuitive Bioinformatics for Genomics Applications: Omega-Brigid Workflow Framework." In Distributed Computing, Artificial Intelligence, Bioinformatics, Soft Computing, and Ambient Assisted Living. Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-02481-8_164.

2

Strozzi, Francesco, Roel Janssen, Ricardo Wurmus, et al. "Scalable Workflows and Reproducible Data Analysis for Genomics." In Methods in Molecular Biology. Springer New York, 2019. http://dx.doi.org/10.1007/978-1-4939-9074-0_24.

3

Contaldi, Felice, Elisa Cappetta, and Salvatore Esposito. "Practical Workflow from High-Throughput Genotyping to Genomic Estimated Breeding Values (GEBVs)." In Methods in Molecular Biology. Springer US, 2020. http://dx.doi.org/10.1007/978-1-0716-1201-9_9.

4

Valadares, Andressa, Maria Emília Walter, and Tainá Raiol. "A Workflow for Predicting MicroRNAs Targets via Accessibility in Flavivirus Genomes." In Advances in Bioinformatics and Computational Biology. Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-01722-4_12.

5

Ocaña, Kary A. C. S., Daniel de Oliveira, Eduardo Ogasawara, Alberto M. R. Dávila, Alexandre A. B. Lima, and Marta Mattoso. "SciPhy: A Cloud-Based Workflow for Phylogenetic Analysis of Drug Targets in Protozoan Genomes." In Advances in Bioinformatics and Computational Biology. Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-22825-4_9.

6

"TCGA Workflow Diagrams." In Collaborative Genomics Projects: A Comprehensive Guide. Elsevier, 2016. http://dx.doi.org/10.1016/b978-0-12-802143-9.00019-1.

7

Hall, David, John Miller, Jonathan Arnold, Krzysztof Kochut, Amit Sheth, and Michael Weise. "Using Workflow to Build an Information Management System for a Geographically Distributed Genome Sequencing Initiative." In Genomics of Plants and Fungi. CRC Press, 2003. http://dx.doi.org/10.1201/9780203912249.pt3.

8

Zainal-Abidin, Rabiatul-Adawiah, and Zeti-Azura Mohamed-Hussein. "Computational Analysis of Rice Transcriptomic and Genomic Datasets in Search for SNPs Involved in Flavonoid Biosynthesis." In Recent Advances in Rice Research [Working Title]. IntechOpen, 2020. http://dx.doi.org/10.5772/intechopen.94876.

Abstract:
This chapter describes the computational approach used in analyzing rice transcriptomic and genomic data to identify and annotate single nucleotide polymorphisms (SNPs) as potential biomarkers in flavonoid production. SNPs play a role in the accumulation of nutritional components (e.g. antioxidants), flavonoids among them; however, the number of identified SNPs associated with flavonoid nutritional traits is still limited. We develop a knowledge-based bioinformatic workflow to search for specific SNPs, together with an integration analysis of the SNPs and their co-expressed genes, to investigate their influence on the gain or loss of function of genes involved in flavonoid production. Raw files obtained from functional genomics studies can be analyzed in detail to obtain useful biological insight. Different tools, algorithms and databases are available to analyze ontology, metabolism and pathways at the molecular level in order to observe the effects of gene and protein expression; using them in combination allows analyses to be integrated, interpreted and built upon, providing a better understanding of the biological meaning of the results. This chapter illustrates how to select and bring together several software tools to develop a specific bioinformatic workflow that processes and analyses omics data. Implementing this workflow revealed potential flavonoid biosynthetic genes that can be used as guide genes to screen for SNPs in flavonoid biosynthetic genes from genomic and transcriptomic data.
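The final screening step described here — using guide genes to pull out SNPs in flavonoid biosynthetic genes — amounts to intersecting variant positions with gene intervals. Below is a minimal sketch with invented coordinates; a real analysis would parse a VCF and a genome annotation.

```python
# Hypothetical guide genes for flavonoid biosynthesis with their
# chromosome intervals (1-based, inclusive); coordinates are invented.
GUIDE_GENES = {
    "CHS": ("chr5", 1_200_000, 1_205_000),
    "F3H": ("chr2", 8_410_000, 8_414_500),
}

def snps_in_guide_genes(snps):
    """Keep SNPs falling inside any guide-gene interval.

    snps: iterable of (chrom, pos, snp_id) tuples, e.g. parsed from a VCF.
    """
    hits = []
    for chrom, pos, snp_id in snps:
        for gene, (g_chrom, start, end) in GUIDE_GENES.items():
            if chrom == g_chrom and start <= pos <= end:
                hits.append((snp_id, gene))
    return hits

calls = [("chr5", 1_201_750, "rs_a"), ("chr1", 50_000, "rs_b"),
         ("chr2", 8_412_000, "rs_c")]
print(snps_in_guide_genes(calls))   # [('rs_a', 'CHS'), ('rs_c', 'F3H')]
```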
9

Yang, Jitao. "Human Knowledge and Expertise Platform for Managing Genomics Projects." In Handbook of Research on the Role of Human Factors in IT Project Management. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-1279-1.ch009.

Abstract:
Genomics has been used to more accurately support scientific research, personalized healthcare, and precision medicine. Professional knowledge and expertise are required to run genomics projects and laboratory workflows, which include project enrollment, project and sample tracking, sample warehousing, nucleic acid extraction, library construction, concentration and peak-map detection, pooling, sequencing, etc. Digitized management of projects and laboratory workflows is therefore very important for handling tens of thousands of samples in parallel. This chapter brings human knowledge and expertise to a project and laboratory management platform that can manage, in an integrated way, complex genomics projects, laboratory workflows, sequencing instruments, analysis pipelines, and genetic interpretation services. The platform can automate routine tasks, facilitate communication, optimize business procedures and workflows, process tens of thousands of samples in parallel, and increase business efficiency.
10

Ugur Sezerman, Osman, Ege Ulgen, Nogayhan Seymen, and Ilknur Melis Durasi. "Bioinformatics Workflows for Genomic Variant Discovery, Interpretation and Prioritization." In Bioinformatics Tools for Detection and Clinical Interpretation of Genomic Variations. IntechOpen, 2019. http://dx.doi.org/10.5772/intechopen.85524.


Conference papers on the topic "Genomics workflow"

1

Pellegrino, Renata, Michael Benway, Paulina Kocjan, et al. "Abstract 5353: High-throughput automation of the 10x Genomics® Chromium™ workflow for linked-read whole exome sequencing and a targeted lynch syndrome panel." In Proceedings: AACR Annual Meeting 2017; April 1-5, 2017; Washington, DC. American Association for Cancer Research, 2017. http://dx.doi.org/10.1158/1538-7445.am2017-5353.

2

Pireddu, Luca, Simone Leo, and Gianluigi Zanetti. "MapReducing a genomic sequencing workflow." In the second international workshop. ACM Press, 2011. http://dx.doi.org/10.1145/1996092.1996106.

3

Potamias, George, Lefteris Koumakis, Alexandros Kanterakis, et al. "Knowledge Discovery Scientific Workflows in Clinico-Genomics." In 19th IEEE International Conference on Tools with Artificial Intelligence(ICTAI 2007). IEEE, 2007. http://dx.doi.org/10.1109/ictai.2007.32.

4

Kanwal, Sehrish, Andrew Lonie, and Richard O. Sinnott. "Digital reproducibility requirements of computational genomic workflows." In 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2017. http://dx.doi.org/10.1109/bibm.2017.8217887.

5

Banerjee, Subho S., Arjun P. Athreya, Liudmila S. Mainzer, et al. "Efficient and Scalable Workflows for Genomic Analyses." In HPDC'16: The 25th International Symposium on High-Performance Parallel and Distributed Computing. ACM, 2016. http://dx.doi.org/10.1145/2912152.2912156.

6

Yang-Turner, Fan, Lawrence Gripper, Jeremy Swann, et al. "An Open-Source Azure Solution for Scalable Genomics Workflows." In 2018 IEEE World Congress on Services (SERVICES). IEEE, 2018. http://dx.doi.org/10.1109/services.2018.00033.

7

Rodriguez, Benjamin, Hok-Hei Tam, David Frankhouser, et al. "A scalable, flexible workflow for MethylCap-seq data analysis." In 2011 IEEE International Workshop on Genomic Signal Processing and Statistics (GENSIPS). IEEE, 2011. http://dx.doi.org/10.1109/gensips.2011.6169426.

8

Terra, Rafael, Micaella Coelho, Lucas Cruz, et al. "Gerência e Análises de Workflows aplicados a Redes Filogenéticas de Genomas de Dengue no Brasil." In Brazilian e-Science Workshop. Sociedade Brasileira de Computação, 2021. http://dx.doi.org/10.5753/bresci.2021.15788.

Abstract:
Evolutionary processes and the dispersal of Dengue genomes in Brazil are relevant to the impact, and to the endemo-epidemic and social surveillance, of emerging arboviruses. Phylogenetic trees and networks make it possible to display evolutionary and reticulate events in viruses, which arise from high diversity and the high mutation rate of frequent homologous recombination. We present a parallel and distributed scientific workflow for phylogenetic networks, designed to work with the diversity of tools and resources found in computational biology experiments and coupled to high-performance computing environments. We report a runtime improvement of approximately 5x compared with sequential execution in analyses of dengue genomes, with identification of recombination events.
9

Fowler, Jerry, F. Anthony San Lucas, Smruthy Sivakumar, Aditya Deshpande, Humam Kadara, and Paul A. Scheet. "Abstract 2594: Optimizing the replication of cancer genomics workflows: case studies." In Proceedings: AACR Annual Meeting 2017; April 1-5, 2017; Washington, DC. American Association for Cancer Research, 2017. http://dx.doi.org/10.1158/1538-7445.am2017-2594.

10

Choudhury, Olivia, Nicholas L. Hazekamp, Douglas Thain, and Scott Emrich. "Accelerating Comparative Genomics Workflows in a Distributed Environment with Optimized Data Partitioning." In 2014 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). IEEE, 2014. http://dx.doi.org/10.1109/ccgrid.2014.79.
