To see the other types of publications on this topic, follow the link: Bioinformatics workflow.

Journal articles on the topic 'Bioinformatics workflow'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Bioinformatics workflow.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Jackson, Michael, Kostas Kavoussanakis, and Edward W. J. Wallace. "Using prototyping to choose a bioinformatics workflow management system." PLOS Computational Biology 17, no. 2 (2021): e1008622. http://dx.doi.org/10.1371/journal.pcbi.1008622.

Abstract:
Workflow management systems represent, manage, and execute multistep computational analyses and offer many benefits to bioinformaticians. They provide a common language for describing analysis workflows, contributing to reproducibility and to building libraries of reusable components. They can support both incremental build and re-entrancy—the ability to selectively re-execute parts of a workflow in the presence of additional inputs or changes in configuration and to resume execution from where a workflow previously stopped. Many workflow management systems enhance portability by supporting the use of containers, high-performance computing (HPC) systems, and clouds. Most importantly, workflow management systems allow bioinformaticians to delegate how their workflows are run to the workflow management system and its developers. This frees the bioinformaticians to focus on what these workflows should do, on their data analyses, and on their science. RiboViz is a package to extract biological insight from ribosome profiling data to help advance understanding of protein synthesis. At the heart of RiboViz is an analysis workflow, implemented in a Python script. To conform to best practices for scientific computing which recommend the use of build tools to automate workflows and to reuse code instead of rewriting it, the authors reimplemented this workflow within a workflow management system. To select a workflow management system, a rapid survey of available systems was undertaken, and candidates were shortlisted: Snakemake, cwltool, Toil, and Nextflow. Each candidate was evaluated by quickly prototyping a subset of the RiboViz workflow, and Nextflow was chosen. The selection process took 10 person-days, a small cost for the assurance that Nextflow satisfied the authors’ requirements. 
The use of prototyping can offer a low-cost way of making a more informed selection of software to use within projects, rather than relying solely upon reviews and recommendations by others.
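The incremental-build and re-entrancy behaviour the abstract describes can be pictured with a small sketch. This is illustrative Python only (not RiboViz, Snakemake, or Nextflow code): a step re-runs only when its output is missing or older than an input, which is what lets a stopped workflow resume where it left off.

```python
import os

# Illustrative sketch of incremental build / re-entrancy (hypothetical
# helper names, not any workflow manager's API): a step is re-executed
# only if its output is missing or out of date relative to its inputs.

def needs_run(inputs, output):
    """True if `output` is absent or older than any input file."""
    if not os.path.exists(output):
        return True
    out_mtime = os.path.getmtime(output)
    return any(os.path.getmtime(p) > out_mtime for p in inputs)

def run_step(inputs, output, action):
    """Run `action` only when the output is stale."""
    if needs_run(inputs, output):
        action(inputs, output)  # do the actual work
    else:
        print(f"skipping {output}: up to date")
```

Real workflow management systems generalize this timestamp check to whole dependency graphs, containers, and cluster schedulers.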
2

Bedő, Justin. "BioShake: a Haskell EDSL for bioinformatics workflows." PeerJ 7 (July 9, 2019): e7223. http://dx.doi.org/10.7717/peerj.7223.

Abstract:
Typical bioinformatics analyses comprise long-running computational workflows. An important part of reproducible research is the management and execution of these workflows to allow robust execution and to minimise errors. BioShake is an embedded domain-specific language in Haskell for specifying and executing computational workflows for bioinformatics that significantly reduces the possibility of errors occurring. Unlike other workflow frameworks, BioShake raises many properties to the type level, allowing the correctness of a workflow to be statically checked during compilation and catching errors before any lengthy execution process. BioShake builds on the Shake build tool to provide robust dependency tracking, parallel execution, reporting, and resumption capabilities. Finally, BioShake abstracts execution so that jobs can either be executed directly or submitted to a cluster. BioShake is available at http://github.com/PapenfussLab/bioshake.
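BioShake's key idea, encoding what each stage consumes and produces so that an ill-formed pipeline fails before anything runs, can be approximated outside Haskell. The sketch below is a hypothetical Python analogue, not BioShake's API, and checks at pipeline-construction time rather than at compile time:

```python
from dataclasses import dataclass

# Hypothetical analogue of BioShake's type-level checking (not its API):
# each stage declares the file type it consumes and produces, so an
# incompatible chain fails before any tool is executed.

@dataclass
class Stage:
    name: str
    consumes: str  # e.g. "fastq", "bam"
    produces: str

def validate(pipeline):
    """Check adjacent stages agree on file types before running anything."""
    for a, b in zip(pipeline, pipeline[1:]):
        if a.produces != b.consumes:
            raise TypeError(f"{a.name} produces {a.produces!r} "
                            f"but {b.name} expects {b.consumes!r}")
    return True

align = Stage("align", consumes="fastq", produces="bam")
sort_ = Stage("sort", consumes="bam", produces="bam")
call = Stage("call-variants", consumes="bam", produces="vcf")
assert validate([align, sort_, call])
```

In Haskell the same mismatch is rejected by the compiler; here it is merely caught before the (potentially days-long) execution starts.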
3

Kožusznik, Jan, Petr Bainar, Jana Klímová, et al. "SPIM workflow manager for HPC." Bioinformatics 35, no. 19 (2019): 3875–76. http://dx.doi.org/10.1093/bioinformatics/btz140.

Abstract:
Summary: Here we introduce a Fiji plugin utilizing the HPC-as-a-Service concept, significantly mitigating the challenges life scientists face when delegating complex data-intensive processing workflows to HPC clusters. We demonstrate on a common Selective Plane Illumination Microscopy image processing task that execution of a Fiji workflow on a remote supercomputer leads to improved turnaround time despite the data transfer overhead. The plugin allows the end users to conveniently transfer image data to remote HPC resources, manage pipeline jobs and visualize processed results directly from the Fiji graphical user interface. Availability and implementation: The code is distributed free and open source under the MIT license. Source code: https://github.com/fiji-hpc/hpc-workflow-manager/, documentation: https://imagej.net/SPIM_Workflow_Manager_For_HPC. Supplementary information: Supplementary data are available at Bioinformatics online.
4

Conery, John S., Julian M. Catchen, and Michael Lynch. "Rule-based workflow management for bioinformatics." VLDB Journal 14, no. 3 (2005): 318–29. http://dx.doi.org/10.1007/s00778-005-0153-9.

5

Köster, Johannes, and Sven Rahmann. "Snakemake—a scalable bioinformatics workflow engine." Bioinformatics 34, no. 20 (2018): 3600. http://dx.doi.org/10.1093/bioinformatics/bty350.

6

Köster, J., and S. Rahmann. "Snakemake—a scalable bioinformatics workflow engine." Bioinformatics 28, no. 19 (2012): 2520–22. http://dx.doi.org/10.1093/bioinformatics/bts480.

7

Simopoulos, Caitlin M. A., Zhibin Ning, Xu Zhang, et al. "pepFunk: a tool for peptide-centric functional analysis of metaproteomic human gut microbiome studies." Bioinformatics 36, no. 14 (2020): 4171–79. http://dx.doi.org/10.1093/bioinformatics/btaa289.

Abstract:
Motivation: Enzymatic digestion of proteins before mass spectrometry analysis is a key process in metaproteomic workflows. Canonical metaproteomic data processing pipelines typically involve matching spectra produced by the mass spectrometer to a theoretical spectra database, followed by matching the identified peptides back to parent-proteins. However, the nature of enzymatic digestion produces peptides that can be found in multiple proteins due to conservation or chance, presenting difficulties with protein and functional assignment. Results: To combat this challenge, we developed pepFunk, a peptide-centric metaproteomic workflow focused on the analysis of human gut microbiome samples. Our workflow includes a curated peptide database annotated with Kyoto Encyclopedia of Genes and Genomes (KEGG) terms and a gene set variation analysis-inspired pathway enrichment adapted for peptide-level data. Analysis using our peptide-centric workflow is fast and highly correlated to a protein-centric analysis, and can identify more enriched KEGG pathways than analysis using protein-level data. Our workflow is open source and available as a web application or source code to be run locally. Availability and implementation: pepFunk is available online as a web application at https://shiny.imetalab.ca/pepFunk/ with open-source code available from https://github.com/northomics/pepFunk. Contact: dfigeys@uottawa.ca. Supplementary information: Supplementary data are available at Bioinformatics online.
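The peptide-centric idea, aggregating peptide intensities directly against a curated peptide-to-KEGG annotation instead of inferring parent proteins first, can be sketched briefly. This is an illustrative fragment with made-up peptides and KO terms, not pepFunk's code (pepFunk itself is an R/Shiny application):

```python
from collections import defaultdict

# Illustrative peptide-centric aggregation (hypothetical data, not
# pepFunk's code): peptide intensities are summed per KEGG term from a
# curated peptide -> KEGG mapping, bypassing protein inference entirely.

# Hypothetical curated annotation: peptide sequence -> KEGG KO terms.
peptide_to_kegg = {
    "LVNELTEFAK": ["K00001"],
    "AGFAGDDAPR": ["K00001", "K00002"],
    "DLGEEHFK": ["K00003"],
}

def kegg_abundance(peptide_intensities):
    """Sum peptide intensities per KEGG term; unannotated peptides are skipped."""
    totals = defaultdict(float)
    for pep, intensity in peptide_intensities.items():
        for term in peptide_to_kegg.get(pep, []):
            totals[term] += intensity
    return dict(totals)
```

This sidesteps the ambiguity the abstract describes: a peptide shared by several proteins still contributes unambiguously to its annotated pathways.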
8

Bhardwaj, Vivek, Steffen Heyne, Katarzyna Sikora, et al. "snakePipes: facilitating flexible, scalable and integrative epigenomic analysis." Bioinformatics 35, no. 22 (2019): 4757–59. http://dx.doi.org/10.1093/bioinformatics/btz436.

Abstract:
Summary: Due to the rapidly increasing scale and diversity of epigenomic data, modular and scalable analysis workflows are of wide interest. Here we present snakePipes, a workflow package for processing and downstream analysis of data from common epigenomic assays: ChIP-seq, RNA-seq, Bisulfite-seq, ATAC-seq, Hi-C and single-cell RNA-seq. snakePipes enables users to assemble variants of each workflow and to easily install and upgrade the underlying tools, via its simple command-line wrappers and YAML files. Availability and implementation: snakePipes can be installed via conda: 'conda install -c mpi-ie -c bioconda -c conda-forge snakePipes'. Source code (https://github.com/maxplanck-ie/snakepipes) and documentation (https://snakepipes.readthedocs.io/en/latest/) are available online. Supplementary information: Supplementary data are available at Bioinformatics online.
9

Theil, Sebastien, and Etienne Rifa. "rANOMALY: AmplicoN wOrkflow for Microbial community AnaLYsis." F1000Research 10 (January 7, 2021): 7. http://dx.doi.org/10.12688/f1000research.27268.1.

Abstract:
Bioinformatic tools for marker gene sequencing data analysis are continuously and rapidly evolving, so integrating the most recent techniques and tools is challenging. We present an R package for data analysis of 16S and ITS amplicon-based sequencing. This workflow is based on several R functions and performs automated processing from fastq sequence files through to diversity and differential analysis with statistical validation. The main purposes of this package are to automate bioinformatic analysis, ensure reproducibility between projects, and be flexible enough to quickly integrate new bioinformatic tools or statistical methods. rANOMALY is an easy-to-install and customizable R package that uses the amplicon sequence variant (ASV) level for microbial community characterization. It integrates the assets of the latest bioinformatics methods, such as better sequence tracking, decontamination from control samples, use of multiple reference databases for taxonomic annotation, all the main ecological analyses, for which we propose advanced statistical tests, and a cross-validated differential analysis by four different methods. Our package produces ready-to-publish figures, and all of its outputs are made to be integrated in Rmarkdown code to produce automated reports.
10

Dijkstra, Maurits J. J., Atze J. van der Ploeg, K. Anton Feenstra, Wan J. Fokkink, Sanne Abeln, and Jaap Heringa. "Tailor-made multiple sequence alignments using the PRALINE 2 alignment toolkit." Bioinformatics 35, no. 24 (2019): 5315–17. http://dx.doi.org/10.1093/bioinformatics/btz572.

Abstract:
Summary: PRALINE 2 is a toolkit for custom multiple sequence alignment workflows. It can be used to incorporate sequence annotations, such as secondary structure or (DNA) motifs, into the alignment scoring, as well as to customize many other aspects of a progressive multiple alignment workflow. Availability and implementation: PRALINE 2 is implemented in Python and available as open source software on GitHub: https://github.com/ibivu/PRALINE/.
11

Cheng, Gong, Quan Lu, Ling Ma, Guocai Zhang, Liang Xu, and Zongshan Zhou. "BGDMdocker: a Docker workflow for data mining and visualization of bacterial pan-genomes and biosynthetic gene clusters." PeerJ 5 (November 30, 2017): e3948. http://dx.doi.org/10.7717/peerj.3948.

Abstract:
Recently, Docker technology has received increasing attention throughout the bioinformatics community. However, it has not yet been mastered by most biologists; accordingly, its application in biological research has been limited. In order to popularize this technology in the field of bioinformatics and to promote the use of publicly available bioinformatics tools, such as Dockerfiles and images from communities, government sources, and private owners in the Docker Hub Registry and other Docker-based resources, we introduce here a complete and accurate bioinformatics workflow based on Docker. The present workflow enables analysis and visualization of pan-genomes and biosynthetic gene clusters of bacteria. This provides a new solution for bioinformatics mining of big data from various publicly available biological databases. The present step-by-step guide creates an integrative workflow through a Dockerfile to allow researchers to build their own image and run a container easily.
12

Yuen, Denis, Louise Cabansay, Andrew Duncan, et al. "The Dockstore: enhancing a community platform for sharing reproducible and accessible computational protocols." Nucleic Acids Research 49, W1 (2021): W624–W632. http://dx.doi.org/10.1093/nar/gkab346.

Abstract:
Dockstore (https://dockstore.org/) is an open source platform for publishing, sharing, and finding bioinformatics tools and workflows. The platform has facilitated large-scale biomedical research collaborations by using cloud technologies to increase the Findability, Accessibility, Interoperability and Reusability (FAIR) of computational resources, thereby promoting the reproducibility of complex bioinformatics analyses. Dockstore supports a variety of source repositories, analysis frameworks, and language technologies to provide a seamless publishing platform for authors to create a centralized catalogue of scientific software. The ready-to-use packaging of hundreds of tools and workflows, combined with the implementation of interoperability standards, enables users to launch analyses across multiple environments. Dockstore is widely used: more than twenty-five high-profile organizations share analysis collections through the platform in a variety of workflow languages, including the Broad Institute's GATK best practice and COVID-19 workflows (WDL), nf-core workflows (Nextflow), the Intergalactic Workflow Commission tools (Galaxy), and workflows from Seven Bridges (CWL), to highlight just a few. Here we describe the improvements made over the last four years, including the expansion of system integrations supporting authors, the addition of collaboration features and analysis platform integrations supporting users, and other enhancements that improve the overall scientific reproducibility of Dockstore content.
13

Damkliang, Kasikrit, Pichaya Tandayya, Unitsa Sangket, and Ekawat Pasomsub. "Integrated Automatic Workflow for Phylogenetic Tree Analysis Using Public Access and Local Web Services." Journal of Integrative Bioinformatics 13, no. 1 (2016): 7–22. http://dx.doi.org/10.1515/jib-2016-287.

Abstract:
Summary: At present, many coding sequences (CDS) have been discovered, and larger CDS are being revealed frequently. Approaches and related tools have also been developed and upgraded concurrently, especially for phylogenetic tree analysis. This paper proposes an integrated automatic Taverna workflow for phylogenetic tree inference using public access web services at the European Bioinformatics Institute (EMBL-EBI) and the Swiss Institute of Bioinformatics (SIB), together with our own locally deployed web services. The workflow input is a set of CDS in the Fasta format. The workflow supports 1,000 to 20,000 bootstrapping replicates. The workflow performs tree inference using the Parsimony (PARS), Distance Matrix - Neighbor Joining (DIST-NJ), and Maximum Likelihood (ML) algorithms of the EMBOSS PHYLIPNEW package, based on our proposed Multiple Sequence Alignment (MSA) similarity score. The local web services are implemented and deployed in two forms, using Soaplab2 and Apache Axis2: SOAP and Java Web Service (JWS) deployments providing WSDL endpoints to Taverna Workbench, a workflow manager. The workflow has been validated, its performance has been measured, and its results have been verified. Our workflow's execution time is less than ten minutes for inferring a tree with 10,000 bootstrapping replicates. This paper proposes a new integrated automatic workflow which will be beneficial to bioinformaticians with an intermediate level of knowledge and experience. All local services have been deployed at our portal http://bioservices.sci.psu.ac.th
14

Emami Khoonsari, Payam, Pablo Moreno, Sven Bergmann, et al. "Interoperable and scalable data analysis with microservices: applications in metabolomics." Bioinformatics 35, no. 19 (2019): 3752–60. http://dx.doi.org/10.1093/bioinformatics/btz160.

Abstract:
Motivation: Developing a robust and performant data analysis workflow that integrates all necessary components whilst still being able to scale over multiple compute nodes is a challenging task. We introduce a generic method based on the microservice architecture, where software tools are encapsulated as Docker containers that can be connected into scientific workflows and executed using the Kubernetes container orchestrator. Results: We developed a Virtual Research Environment (VRE) which facilitates rapid integration of new tools and developing scalable and interoperable workflows for performing metabolomics data analysis. The environment can be launched on-demand on cloud resources and desktop computers. IT-expertise requirements on the user side are kept to a minimum, and workflows can be re-used effortlessly by any novice user. We validate our method in the field of metabolomics on two mass spectrometry, one nuclear magnetic resonance spectroscopy and one fluxomics study. We showed that the method scales dynamically with increasing availability of computational resources. We demonstrated that the method facilitates interoperability using integration of the major software suites, resulting in a turn-key workflow encompassing all steps for mass-spectrometry-based metabolomics including preprocessing, statistics and identification. Microservices is a generic methodology that can serve any scientific discipline and opens up for new types of large-scale integrative science. Availability and implementation: The PhenoMeNal consortium maintains a web portal (https://portal.phenomenal-h2020.eu) providing a GUI for launching the Virtual Research Environment. The GitHub repository https://github.com/phnmnl/ hosts the source code of all projects. Supplementary information: Supplementary data are available at Bioinformatics online.
15

Zhou, Chao. "A Workflow Developing and Executing Environment for Bioinformatics." International Journal on Advances in Information Sciences and Service Sciences 5, no. 3 (2013): 850–57. http://dx.doi.org/10.4156/aiss.vol5.issue3.99.

16

Ewels, Philip, Felix Krueger, Max Käller, and Simon Andrews. "Cluster Flow: A user-friendly bioinformatics workflow tool." F1000Research 5 (December 6, 2016): 2824. http://dx.doi.org/10.12688/f1000research.10335.1.

Abstract:
Pipeline tools are becoming increasingly important within the field of bioinformatics. Using a pipeline manager to manage and run workflows comprising multiple tools reduces workload and makes analysis results more reproducible. Existing tools require significant work to install and get running, typically needing pipeline scripts to be written from scratch before running any analysis. We present Cluster Flow, a simple and flexible bioinformatics pipeline tool designed to be quick and easy to install. Cluster Flow comes with 40 modules for common NGS processing steps, ready to work out of the box. Pipelines are assembled using these modules with a simple syntax that can be easily modified as required. Core helper functions automate many common NGS procedures, making running pipelines simple. Cluster Flow is available under a GNU GPLv3 license on GitHub. Documentation, examples and an online demo are available at http://clusterflow.io.
17

Ewels, Philip, Felix Krueger, Max Käller, and Simon Andrews. "Cluster Flow: A user-friendly bioinformatics workflow tool." F1000Research 5 (May 2, 2017): 2824. http://dx.doi.org/10.12688/f1000research.10335.2.

Abstract:
Pipeline tools are becoming increasingly important within the field of bioinformatics. Using a pipeline manager to manage and run workflows comprising multiple tools reduces workload and makes analysis results more reproducible. Existing tools require significant work to install and get running, typically needing pipeline scripts to be written from scratch before running any analysis. We present Cluster Flow, a simple and flexible bioinformatics pipeline tool designed to be quick and easy to install. Cluster Flow comes with 40 modules for common NGS processing steps, ready to work out of the box. Pipelines are assembled using these modules with a simple syntax that can be easily modified as required. Core helper functions automate many common NGS procedures, making running pipelines simple. Cluster Flow is available under a GNU GPLv3 license on GitHub. Documentation, examples and an online demo are available at http://clusterflow.io.
18

Goderis, Antoon, Paul Fisher, Andrew Gibson, et al. "Benchmarking workflow discovery: a case study from bioinformatics." Concurrency and Computation: Practice and Experience 21, no. 16 (2009): 2052–69. http://dx.doi.org/10.1002/cpe.1447.

19

Mondelli, Maria Luiza, Thiago Magalhães, Guilherme Loss, et al. "BioWorkbench: a high-performance framework for managing and analyzing bioinformatics experiments." PeerJ 6 (August 29, 2018): e5551. http://dx.doi.org/10.7717/peerj.5551.

Abstract:
Advances in sequencing techniques have led to exponential growth in biological data, demanding the development of large-scale bioinformatics experiments. Because these experiments are computation- and data-intensive, they require high-performance computing techniques and can benefit from specialized technologies such as Scientific Workflow Management Systems and databases. In this work, we present BioWorkbench, a framework for managing and analyzing bioinformatics experiments. This framework automatically collects provenance data, including both performance data from workflow execution and data from the scientific domain of the workflow application. Provenance data can be analyzed through a web application that abstracts a set of queries to the provenance database, simplifying access to provenance information. We evaluate BioWorkbench using three case studies: SwiftPhylo, a phylogenetic tree assembly workflow; SwiftGECKO, a comparative genomics workflow; and RASflow, a RASopathy analysis workflow. We analyze each workflow from both computational and scientific domain perspectives, by using queries to a provenance and annotation database. Some of these queries are available as a pre-built feature of the BioWorkbench web application. Through the provenance data, we show that the framework is scalable and achieves high performance, reducing the case studies' execution time by up to 98%. We also show how the application of machine learning techniques can enrich the analysis process.
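Once a workflow system records per-task provenance, bottleneck questions become simple queries. The following sketch uses a hypothetical schema and made-up runtimes, not BioWorkbench's actual provenance database, to show the kind of query such a framework abstracts behind its web application:

```python
import sqlite3

# Illustrative provenance query (hypothetical schema and numbers, not
# BioWorkbench's database): per-task runtimes recorded during execution
# make bottleneck analysis a one-line SQL question.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_run (workflow TEXT, task TEXT, seconds REAL)")
conn.executemany(
    "INSERT INTO task_run VALUES (?, ?, ?)",
    [("SwiftPhylo", "align", 120.0),
     ("SwiftPhylo", "build_tree", 340.0),
     ("SwiftPhylo", "bootstrap", 900.0)],
)

# Which task dominates the workflow's execution time?
slowest = conn.execute(
    "SELECT task, seconds FROM task_run "
    "WHERE workflow = ? ORDER BY seconds DESC LIMIT 1",
    ("SwiftPhylo",),
).fetchone()
```

The same table could be joined against domain annotations (sample, parameter set) to answer the mixed computational/scientific questions the abstract describes.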
20

Linke, Burkhard, Robert Giegerich, and Alexander Goesmann. "Conveyor: a workflow engine for bioinformatic analyses." Bioinformatics 27, no. 7 (2011): 903–11. http://dx.doi.org/10.1093/bioinformatics/btr040.

21

Kahsay, Robel, Jeet Vora, Rahi Navelkar, et al. "GlyGen data model and processing workflow." Bioinformatics 36, no. 12 (2020): 3941–43. http://dx.doi.org/10.1093/bioinformatics/btaa238.

Abstract:
Summary: Glycoinformatics plays a major role in glycobiology research, and the development of a comprehensive glycoinformatics knowledgebase is critical. This application note describes the GlyGen data model, processing workflow and the data access interfaces featuring programmatic use case example queries based on specific biological questions. The GlyGen project is a data integration, harmonization and dissemination project for carbohydrate and glycoconjugate-related data retrieved from multiple international data sources including UniProtKB, GlyTouCan, UniCarbKB and other key resources. Availability and implementation: The GlyGen web portal is freely available to access at https://glygen.org. The data portal, web services, SPARQL endpoint and GitHub repository are also freely available at https://data.glygen.org, https://api.glygen.org, https://sparql.glygen.org and https://github.com/glygener, respectively. All code is released under the GNU General Public License version 3 (GNU GPLv3) and is available on GitHub: https://github.com/glygener. The datasets are made available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Supplementary information: Supplementary data are available at Bioinformatics online.
22

Sarkadi, Balazs, Istvan Liko, Gabor Nyiro, Peter Igaz, Henriett Butz, and Attila Patocs. "Analytical Performance of NGS-Based Molecular Genetic Tests Used in the Diagnostic Workflow of Pheochromocytoma/Paraganglioma." Cancers 13, no. 16 (2021): 4219. http://dx.doi.org/10.3390/cancers13164219.

Abstract:
Next Generation Sequencing (NGS)-based methods are high-throughput and cost-effective molecular genetic diagnostic tools. Targeted gene panel and whole exome sequencing (WES) are applied in clinical practice for assessing mutations of pheochromocytoma/paraganglioma (PPGL)-associated genes, but the best strategy is debated. Germline mutations of at least 18 PPGL genes are present in approximately 20–40% of patients, thus molecular genetic testing is recommended in all cases. We aimed to evaluate the analytical and clinical performance of NGS methods for mutation detection of PPGL-associated genes. WES (three different library preparation and bioinformatics workflows) and an in-house, hybridization-based gene panel (endocrine-onco-gene panel, ENDOGENE) were evaluated on 37 (20 WES and 17 ENDOGENE) samples with known variants. After optimization of the bioinformatic workflow, 61 additional samples were tested prospectively. All clinically relevant variants were validated with Sanger sequencing. Target capture of PPGL genes differed markedly between WES platforms and genes tested. All known variants were correctly identified by all methods, but methods of library preparation, sequencing platforms and bioinformatic settings significantly affected the diagnostic accuracy. The ENDOGENE panel identified several pathogenic mutations and unusual genotype–phenotype associations, suggesting that the whole panel should be used for identification of genetic susceptibility of PPGL.
23

Puig, Oscar, Eugene Joseph, Malgorzata Jaremko, et al. "Comprehensive next generation sequencing assay and bioinformatic pipeline for identifying pathogenic variants associated with hereditary cancers." Journal of Clinical Oncology 35, no. 15_suppl (2017): e13105-e13105. http://dx.doi.org/10.1200/jco.2017.35.15_suppl.e13105.

Abstract:
Background: Diagnosis of hereditary cancer syndromes involves time-consuming comprehensive clinical and laboratory work-up; however, timely and accurate diagnosis is pivotal to the clinical management of cancer patients. Germline genetic testing has been shown to facilitate the diagnostic process, allowing for identification and management of individuals at risk for inherited cancers. However, the laboratory diagnostics process requires not only development and validation of comprehensive gene panels to improve diagnostic yields, but a quality-driven workflow including an end-to-end bioinformatics pipeline, and a robust process for variant classification. We will present a gene panel for the evaluation of hereditary cancer syndromes, conducted utilizing our novel end-to-end workflow, and validated in a CLIA-approved environment. Methods: A targeted Next-Generation Sequencing (NGS) panel consisting of 130 genes, including exons, promoters, 5’-UTRs, 3’-UTRs and selected introns, was designed to include genes associated with hereditary cancers. The assay was validated using samples from the 1000 Genomes Project and samples with known pathogenic variants. Elements software was utilized for the end-to-end bioinformatic process, ensuring adherence with CLIA quality standards and supporting manual curation of sequence variants. Results: Preliminary data from our current panel of genes associated with hereditary cancer syndromes revealed high sensitivity, specificity, and positive predictive value. Accuracy was confirmed by analysis of known SNVs, indels, and CNVs using 1000 Genomes and samples carrying pathogenic variants. The bioinformatics software allowed for an end-to-end quality-controlled process of handling and analyzing the NGS data, showing applicability for a clinical laboratory workflow.
Conclusions: We have developed a comprehensive and accurate genetic testing process based on an automated and quality-driven bioinformatics workflow that can be used to identify clinically important variants in genes associated with hereditary cancers. Its performance allows for implementation in the clinical laboratory setting.
24

Thi Nhung, Doan, and Bui Van Ngoc. "Bioinformatic approaches for analysis of coral-associated bacteria using R programming language." Vietnam Journal of Biotechnology 18, no. 4 (2021): 733–43. http://dx.doi.org/10.15625/1811-4989/18/4/15320.

Abstract:
Recent advances in metagenomics and bioinformatics allow robust analysis of the composition and abundance of microbial communities, functional genes, and their metabolic pathways. Although there is now a variety of computational/statistical tools and software for analyzing microbiomes, a common problem in their use is the lack of synchronization and compatibility between the output/input data formats of different software. To overcome these challenges, in this study we apply the DADA2 pipeline (written in the R programming language), instead of a set of different bioinformatics tools, to create our own workflow for microbial community analysis in a continuous and synchronous manner. As a first effort, we investigated the composition and abundance of coral-associated bacteria using their 16S rRNA gene amplicon sequences. The workflow includes the following steps: data processing, sequence clustering, taxonomic assignment, and data visualization. Moreover, we would also like to draw readers' attention to the information about bacterial communities living in the ocean, as most marine microorganisms are unculturable, especially those residing in coral reefs; here, the bacteria are associated with the coral Acropora tenuis. The outcomes obtained in this study suggest that the DADA2 pipeline written in the R programming language is a promising bioinformatics approach for microbiome analysis, as an alternative to using various separate software tools. Besides, our modifications to the workflow execution help researchers to illustrate metagenomic data more easily and systematically, to elucidate the composition, abundance, diversity, and relationships between microbial communities, and to develop other bioinformatic tools more effectively.
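The stages named in the abstract (data processing, sequence clustering, taxonomic assignment, data visualization) chain naturally as functions. The authors' real pipeline is DADA2 in R; the toy Python sketch below covers only the first three stages, and every name in it is hypothetical:

```python
# Toy sketch of a chained amplicon workflow (hypothetical names; the
# authors' actual pipeline is DADA2 in R).

def quality_filter(reads):
    """Data processing: drop reads containing ambiguous bases."""
    return [r for r in reads if "N" not in r]

def cluster(reads):
    """Sequence clustering: collapse identical reads into variants with counts."""
    variants = {}
    for r in reads:
        variants[r] = variants.get(r, 0) + 1
    return variants

def assign_taxonomy(variants, reference):
    """Taxonomic assignment: look up each variant in a reference table."""
    return {seq: reference.get(seq, "Unclassified") for seq in variants}

def run_pipeline(reads, reference):
    variants = cluster(quality_filter(reads))
    return variants, assign_taxonomy(variants, reference)
```

Keeping every stage as a plain function with compatible inputs and outputs is exactly the synchronization the abstract argues for: each step's output feeds the next without format conversion.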
25

Lee, Michael D. "GToTree: a user-friendly workflow for phylogenomics." Bioinformatics 35, no. 20 (2019): 4162–64. http://dx.doi.org/10.1093/bioinformatics/btz188.

Abstract:
Summary: Genome-level evolutionary inference (i.e. phylogenomics) is becoming an increasingly essential step in many biologists’ work. Accordingly, there are several tools available for the major steps in a phylogenomics workflow. But for the biologist whose main focus is not bioinformatics, much of the computational work required—such as accessing genomic data on large scales, integrating genomes from different file formats, performing required filtering, stitching different tools together etc.—can be prohibitive. Here I introduce GToTree, a command-line tool that can take any combination of fasta files, GenBank files and/or NCBI assembly accessions as input and outputs an alignment file, estimates of genome completeness and redundancy, and a phylogenomic tree based on a specified single-copy gene (SCG) set. Although GToTree can work with any custom hidden Markov Models (HMMs), also included are 13 newly generated SCG-set HMMs for different lineages and levels of resolution, built based on searches of ∼12 000 bacterial and archaeal high-quality genomes. GToTree aims to give more researchers the capability to make phylogenomic trees. Availability and implementation: GToTree is open-source and freely available for download from: github.com/AstrobioMike/GToTree. It is implemented primarily in bash with helper scripts written in python. Supplementary information: Supplementary data are available at Bioinformatics online.
26

Piras, Marco Enrico, Luca Pireddu, and Gianluigi Zanetti. "wft4galaxy: a workflow testing tool for galaxy." Bioinformatics 33, no. 23 (2017): 3805–7. http://dx.doi.org/10.1093/bioinformatics/btx461.

27

Wercelens, Polyane, Waldeyr da Silva, Fernanda Hondo, et al. "Bioinformatics Workflows With NoSQL Database in Cloud Computing." Evolutionary Bioinformatics 15 (January 2019): 117693431988997. http://dx.doi.org/10.1177/1176934319889974.

Abstract:
Scientific workflows can be understood as arrangements of managed activities executed by different processing entities. Applying workflows to solve problems in Molecular Biology, notably those related to sequence analyses, is a regular Bioinformatics approach. Due to the nature of the raw data and the in silico environment of Molecular Biology experiments, 2 practical and closely related problems have been studied apart from the research subject itself: reproducibility and the computational environment. When aiming to enhance the reproducibility of Bioinformatics experiments, various aspects should be considered. The reproducibility requirements comprise the data provenance, which enables the acquisition of knowledge about the trajectory of data over a defined workflow, the settings of the programs, and the entire computational environment. Cloud computing is a booming alternative that can provide this computational environment, hiding technical details and delivering a more affordable, accessible, and configurable on-demand environment for researchers. Considering this specific scenario, we propose a solution to improve the reproducibility of Bioinformatics workflows in a cloud computing environment using both Infrastructure as a Service (IaaS) and Not only SQL (NoSQL) database systems. To meet this goal, we built 3 typical Bioinformatics workflows and ran them on 1 private and 2 public clouds, using different types of NoSQL database systems to persist the provenance data according to the Provenance Data Model (PROV-DM). We present the results and a guide for the deployment of a cloud environment for Bioinformatics, exploring the characteristics of various NoSQL database systems for persisting provenance data.
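The idea of persisting workflow provenance as document-store records, as this abstract describes, can be illustrated with a small in-memory sketch. The field names below only loosely follow PROV-DM core terms (activity, used, generated); they are not the authors' schema, and the in-memory list is a hypothetical stand-in for a real NoSQL system.

```python
# Sketch: workflow provenance persisted as document-style records, loosely
# following PROV-DM core terms. A real deployment would write these
# documents to a NoSQL database rather than an in-memory list.
class ProvenanceStore:
    def __init__(self):
        self.documents = []

    def record_activity(self, activity_id, used, generated, params):
        """Persist one workflow step: its inputs, outputs, and settings."""
        doc = {
            "prov:activity": activity_id,
            "prov:used": used,            # input entities
            "prov:generated": generated,  # output entities
            "settings": params,           # program configuration
        }
        self.documents.append(doc)
        return doc

    def lineage(self, entity_id):
        """Walk generation links backwards to find every input an entity
        transitively depends on (the 'trajectory of data' in the abstract)."""
        inputs, frontier = set(), [entity_id]
        while frontier:
            target = frontier.pop()
            for doc in self.documents:
                if target in doc["prov:generated"]:
                    for src in doc["prov:used"]:
                        if src not in inputs:
                            inputs.add(src)
                            frontier.append(src)
        return inputs

store = ProvenanceStore()
store.record_activity("filter", ["raw.fastq"], ["clean.fastq"], {"minq": 20})
store.record_activity("align", ["clean.fastq", "ref.fa"], ["out.bam"], {"tool": "bwa"})
print(sorted(store.lineage("out.bam")))
```

Because each record is schema-free, the same documents could be persisted unchanged in document-, key-value-, or graph-oriented NoSQL systems, which is the comparison the paper carries out.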
28

Sohn, Bong-Ki, Keon-Myung Lee, and Hak-Joon Kim. "A Multiagent System for Workflow-Based Bioinformatics Tool Integration." International Journal of Fuzzy Logic and Intelligent Systems 3, no. 2 (2003): 133–37. http://dx.doi.org/10.5391/ijfis.2003.3.2.133.

29

Plesniewicz, Gerald, and Baurzhan Karabekov. "Specifying temporal knowledge for workflows ontologies." Open Computer Science 6, no. 1 (2016): 226–31. http://dx.doi.org/10.1515/comp-2016-0020.

Abstract:
A workflow is an automation of a process in which participants (people or programs) are involved in activities for solving a set of tasks, according to certain rules and constraints, in order to attain a common goal. The concept of workflow appeared in business informatics. Currently, workflow techniques are used in many other fields, such as medical informatics, bioinformatics, automation of scientific research, and computer-aided design and manufacturing. An ontology is a formal description (in terms of concepts, entities, their properties, and relationships) of knowledge for solving a given class of problems. In particular, ontologies can be used in problems related to workflows. In this paper, we introduce a formalism that extends the language of Allen’s interval logic and show how this formalism can be applied to specify temporal knowledge in ontologies for workflows. For the extended Allen’s logic, we construct a deduction system based on the analytic tableaux method. We also show (by examples) how to apply the deduction method to query answering over ontologies written in the extended Allen’s logic.
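The interval reasoning this abstract builds on can be shown concretely. The sketch below computes which of Allen's 13 basic relations holds between two (start, end) intervals; this is a standard textbook encoding of the base relations, not the paper's extended language or its tableaux deduction system.

```python
# Classify the Allen relation between two intervals given as (start, end)
# pairs, with start < end. Names follow Allen's standard vocabulary.
def allen_relation(a, b):
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:  return "before"
    if b2 < a1:  return "after"
    if a2 == b1: return "meets"
    if b2 == a1: return "met-by"
    if a1 == b1 and a2 == b2: return "equal"
    if a1 == b1: return "starts" if a2 < b2 else "started-by"
    if a2 == b2: return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2: return "during"
    if a1 < b1 and b2 < a2: return "contains"
    return "overlaps" if a1 < b1 else "overlapped-by"

# Two workflow activities: a 'review' step happening inside an 'approval' window.
print(allen_relation((2, 4), (1, 6)))  # during
```

In a workflow ontology, constraints such as "review must happen during approval" become assertions that the relation between two activity intervals is one of these thirteen, which is what the paper's deduction system reasons over.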
30

Chakroborti, Debasish, Banani Roy, and Sristy Sumana Nath. "Designing for Recommending Intermediate States in A Scientific Workflow Management System." Proceedings of the ACM on Human-Computer Interaction 5, EICS (2021): 1–29. http://dx.doi.org/10.1145/3457145.

Abstract:
To process a large amount of data sequentially and systematically, proper management of workflow components (i.e., modules, data, configurations, and associations among ports and links) in a Scientific Workflow Management System (SWfMS) is essential. Managing data with provenance in a SWfMS to support reusability of workflows, modules, and data is not a simple task. Handling such components is even more burdensome for frequently assembled and executed complex workflows that investigate large datasets with different technologies (i.e., various learning algorithms or models). While a great many studies propose techniques and technologies for managing and recommending services in a SWfMS, only a very few consider the management of data in a SWfMS for efficient storage and the facilitation of workflow executions. Furthermore, no study has inquired into the effectiveness and efficiency of such data management in a SWfMS from a user perspective. In this paper, we present and evaluate a GUI version of a novel approach to intermediate data management with two use cases (Plant Phenotyping and Bioinformatics). The technique, which we call GUI-RISPTS (Recommending Intermediate States from Pipelines Considering Tool-States), can facilitate executions of workflows with processed data (i.e., intermediate outcomes of modules in a workflow) and can thus reduce the computational time of some modules in a SWfMS. We integrated GUI-RISPTS with an existing workflow management system called SciWorCS. In SciWorCS, we present an interface through which users select recommendations of intermediate states (i.e., modules' outcomes). We investigated the effectiveness of GUI-RISPTS from users' perspectives, along with measuring its overhead in terms of storage and its efficiency in workflow execution.
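The core idea of reusing intermediate states can be sketched as a cache keyed by module identity, configuration, and a fingerprint of the input, so a re-run of an identical pipeline step is answered from storage instead of recomputed. This is a conceptual illustration only, not the GUI-RISPTS/SciWorCS implementation.

```python
# Sketch: memoize each module's output keyed by (module, config, input),
# so repeated executions of the same step reuse the stored intermediate
# state. Key construction and storage are illustrative stand-ins.
import hashlib
import json

class IntermediateStateCache:
    def __init__(self):
        self.states = {}
        self.hits = 0

    def _key(self, module, config, data):
        blob = json.dumps([module, config, data], sort_keys=True)
        return hashlib.sha256(blob.encode()).hexdigest()

    def run(self, module, func, config, data):
        key = self._key(module, config, data)
        if key in self.states:
            self.hits += 1          # reuse the stored intermediate state
        else:
            self.states[key] = func(data, **config)
        return self.states[key]

cache = IntermediateStateCache()
normalize = lambda xs, factor: [x / factor for x in xs]
first = cache.run("normalize", normalize, {"factor": 2}, [2, 4, 6])
second = cache.run("normalize", normalize, {"factor": 2}, [2, 4, 6])
print(first, cache.hits)  # [1.0, 2.0, 3.0] 1
```

The tool-state part of the paper's key (which tool version produced the result) would be folded into the cache key the same way as the configuration is here.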
31

Palmblad, Magnus, Anna-Lena Lamprecht, Jon Ison, and Veit Schwämmle. "Automated workflow composition in mass spectrometry-based proteomics." Bioinformatics 35, no. 4 (2018): 656–64. http://dx.doi.org/10.1093/bioinformatics/bty646.

32

Phillips, Jason R., Daniel L. Svoboda, Arpit Tandon, et al. "BMDExpress 2: enhanced transcriptomic dose-response analysis workflow." Bioinformatics 35, no. 10 (2018): 1780–82. http://dx.doi.org/10.1093/bioinformatics/bty878.

33

Claeys, M., V. Storms, H. Sun, T. Michoel, and K. Marchal. "MotifSuite: workflow for probabilistic motif detection and assessment." Bioinformatics 28, no. 14 (2012): 1931–32. http://dx.doi.org/10.1093/bioinformatics/bts293.

34

Mariette, Jérôme, Frédéric Escudié, Philippe Bardou, et al. "Jflow: a workflow management system for web applications." Bioinformatics 32, no. 3 (2015): 456–58. http://dx.doi.org/10.1093/bioinformatics/btv589.

35

Sztromwasser, Paweł, Kjell Petersen, and Pál Puntervoll. "Data partitioning enables the use of standard SOAP Web Services in genome-scale workflows." Journal of Integrative Bioinformatics 8, no. 2 (2011): 95–114. http://dx.doi.org/10.1515/jib-2011-163.

Abstract:
Biological databases and computational biology tools are provided by research groups around the world and made accessible on the Web. Combining these resources is common practice in bioinformatics, but integration of heterogeneous and often distributed tools and datasets can be challenging. To date, this challenge has commonly been addressed in a pragmatic way, by tedious and error-prone scripting. Recently, however, a more reliable technique has been identified and proposed as the platform to tie bioinformatics resources together, namely Web Services. In the last decade, Web Services have spread widely in bioinformatics and earned the title of recommended technology. However, in the era of high-throughput experimentation, a major concern regarding Web Services is their ability to handle large-scale data traffic. We propose a stream-like communication pattern for standard SOAP Web Services that enables efficient flow of large data traffic between a workflow orchestrator and Web Services. We evaluated the data-partitioning strategy by comparing it with typical communication patterns on an example pipeline for genomic sequence annotation. The results show that data partitioning lowers the resource demands of services and increases their throughput, which in consequence allows in silico experiments to be executed at genome scale using standard SOAP Web Services and workflows. As a proof of principle, we annotated an RNA-seq dataset using a plain BPEL workflow engine.
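The data-partitioning pattern described in this abstract can be sketched generically: split a large payload into fixed-size chunks, submit each chunk as a separate small call, and reassemble the results. The "service" below is a local stand-in function, not an actual SOAP endpoint, and the chunk size is an arbitrary illustration.

```python
# Sketch: instead of one huge request to a service, the payload is split
# into fixed-size partitions that are submitted one at a time and the
# results reassembled, keeping each individual call small.
def partition(data, chunk_size):
    """Yield successive fixed-size slices of the payload."""
    for i in range(0, len(data), chunk_size):
        yield data[i:i + chunk_size]

def annotate_service(chunk):
    """Stand-in for a remote annotation call on one partition."""
    return chunk.lower()

def run_partitioned(sequence, chunk_size=4):
    """Orchestrator side: call the service per partition, then reassemble."""
    return "".join(annotate_service(c) for c in partition(sequence, chunk_size))

genome = "ACGTACGTACGT"
print(run_partitioned(genome))           # same result as one big call
print(len(list(partition(genome, 4))))   # but made in 3 small calls
```

The design point is that each call's memory footprint is bounded by the chunk size rather than by the genome size, which is what lets a standard request/response protocol scale to genome-sized inputs.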
36

Ahmed, Azza E., Phelelani T. Mpangase, Sumir Panji, et al. "Organizing and running bioinformatics hackathons within Africa: The H3ABioNet cloud computing experience." AAS Open Research 1 (April 18, 2018): 9. http://dx.doi.org/10.12688/aasopenres.12847.1.

Abstract:
The need for portable and reproducible genomics analysis pipelines is growing globally as well as in Africa, especially with the growth of collaborative projects like the Human Health and Heredity in Africa Consortium (H3Africa). The Pan-African H3Africa Bioinformatics Network (H3ABioNet) recognized the need for portable, reproducible pipelines adapted to heterogeneous compute environments, and for the nurturing of technical expertise in workflow languages and containerization technologies. To address this need, in 2016 H3ABioNet arranged its first Cloud Computing and Reproducible Workflows Hackathon, with the purpose of building key genomics analysis pipelines able to run on heterogeneous computing environments and meeting the needs of H3Africa research projects. This paper describes the preparations for this hackathon and reflects upon the lessons learned about its impact on building the technical and scientific expertise of African researchers. The workflows developed were made publicly available in GitHub repositories and deposited as container images on quay.io.
37

Ma, Xiaoxia, Yijun Meng, Pu Wang, Zhonghai Tang, Huizhong Wang, and Tian Xie. "Bioinformatics-assisted, integrated omics studies on medicinal plants." Briefings in Bioinformatics 21, no. 6 (2019): 1857–74. http://dx.doi.org/10.1093/bib/bbz132.

Abstract:
The immense therapeutic and economic values of medicinal plants have attracted increasing attention from researchers worldwide. It has been recognized that production of authentic and high-quality herbal drugs has become a prerequisite for maintaining the healthy development of the traditional medicine industry. To this end, intensive research efforts have been devoted to basic studies, in order to pave the way for standardized authentication of plant materials and bioengineering of the metabolic pathways in medicinal plants. In this paper, recent advances in omics studies on medicinal plants are summarized from several aspects, including phenomics and taxonomics, genomics, transcriptomics, proteomics, and metabolomics. We propose a multi-omics data-based workflow for medicinal plant research. It is emphasized that integration of the omics data is important for plant authentication and mechanistic studies on plant metabolism. Additionally, computational tools for proper storage, efficient processing, and high-throughput analyses of the omics data have been introduced into the workflow. According to the workflow, authentication of medicinal plant materials should not only be performed at the phenomics level but also be implemented through genomic and metabolomic marker-based examination. On the other hand, functional genomics studies, transcriptional regulatory networks, and protein–protein interactions will contribute greatly to deciphering the secondary metabolic pathways. Finally, we hope that our work will inspire further efforts on bioinformatics-assisted, integrated omics studies on medicinal plants.
38

Huang, Yu, Jian Yong Wang, Xiao Mei Wei, and Bin Hu. "Bioinfo-Kit: A Sharing Software Tool for Bioinformatics." Applied Mechanics and Materials 472 (January 2014): 466–69. http://dx.doi.org/10.4028/www.scientific.net/amm.472.466.

Abstract:
In bioinformatics research, toolkits for processing and analyzing biological data often lack an effective algorithm-integrating mechanism and a friendly graphical user interface. To address this, the authors designed and implemented a sharing software system, Bioinfo-Kit, based on the Java 2 Platform Enterprise Edition (J2EE), which provides 1) a general application development interface to integrate or bridge other programs and 2) a workflow mechanism to operate them and make them interoperate easily. In addition, a module for biological data analysis (multiple-electrode data, biomedical data, etc.) was implemented in the software system, providing a workflow mechanism to integrate a series of algorithms and visualization tools. Because the interface is very general and flexible, new analysis tools can be integrated effectively as required. Bioinfo-Kit thus provides an ideal environment for integrative bioinformatics research.
39

Anslan, Sten, R. Henrik Nilsson, Christian Wurzbacher, Petr Baldrian, Leho Tedersoo, and Mohammad Bahram. "Great differences in performance and outcome of high-throughput sequencing data analysis platforms for fungal metabarcoding." MycoKeys 39 (September 10, 2018): 29–40. http://dx.doi.org/10.3897/mycokeys.39.28109.

Abstract:
Along with recent developments in high-throughput sequencing (HTS) technologies, and thus the fast accumulation of HTS data, there has been a growing need and interest in developing tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions, and outputs. To evaluate the potential effect of different bioinformatics workflows on the results, we compared the performance of different analysis platforms on two contrasting high-throughput sequencing data sets. Our analysis revealed that the computation time, the quality of error filtering, and hence the output of a specific bioinformatics process largely depend on the platform used. Our results show that none of the bioinformatics workflows appears to perfectly filter out the accumulated errors when generating Operational Taxonomic Units (OTUs), although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon dataset. We conclude that the output of each platform requires manual validation of the OTUs by examining the taxonomy assignment values.
40

Ahmed, Azza E., Phelelani T. Mpangase, Sumir Panji, et al. "Organizing and running bioinformatics hackathons within Africa: The H3ABioNet cloud computing experience." AAS Open Research 1 (August 7, 2019): 9. http://dx.doi.org/10.12688/aasopenres.12847.2.

Abstract:
The need for portable and reproducible genomics analysis pipelines is growing globally as well as in Africa, especially with the growth of collaborative projects like the Human Health and Heredity in Africa Consortium (H3Africa). The Pan-African H3Africa Bioinformatics Network (H3ABioNet) recognized the need for portable, reproducible pipelines adapted to heterogeneous computing environments, and for the nurturing of technical expertise in workflow languages and containerization technologies. Building on the network’s Standard Operating Procedures (SOPs) for common genomic analyses, H3ABioNet arranged its first Cloud Computing and Reproducible Workflows Hackathon in 2016, with the purpose of translating those SOPs into analysis pipelines able to run on heterogeneous computing environments and meeting the needs of H3Africa research projects. This paper describes the preparations for this hackathon and reflects upon the lessons learned about its impact on building the technical and scientific expertise of African researchers. The workflows developed were made publicly available in GitHub repositories and deposited as container images on Quay.io.
41

Jeong, E., M. Nagasaki, E. Ikeda, Y. Sekiya, A. Saito, and S. Miyano. "CSO validator: improving manual curation workflow for biological pathways." Bioinformatics 27, no. 17 (2011): 2471–72. http://dx.doi.org/10.1093/bioinformatics/btr395.

42

Kano, Yoshinobu, Paul Dobson, Mio Nakanishi, Jun'ichi Tsujii, and Sophia Ananiadou. "Text mining meets workflow: linking U-Compare with Taverna." Bioinformatics 26, no. 19 (2010): 2486–87. http://dx.doi.org/10.1093/bioinformatics/btq464.

43

Gouet, P., and E. Courcelle. "ENDscript: a workflow to display sequence and structure information." Bioinformatics 18, no. 5 (2002): 767–68. http://dx.doi.org/10.1093/bioinformatics/18.5.767.

44

Peleg, M., I. Yeh, and R. B. Altman. "Modelling biological processes using workflow and Petri Net models." Bioinformatics 18, no. 6 (2002): 825–37. http://dx.doi.org/10.1093/bioinformatics/18.6.825.

45

Wrede, Fredrik, and Andreas Hellander. "Smart computational exploration of stochastic gene regulatory network models using human-in-the-loop semi-supervised learning." Bioinformatics 35, no. 24 (2019): 5199–206. http://dx.doi.org/10.1093/bioinformatics/btz420.

Abstract:
Motivation: Discrete stochastic models of gene regulatory networks are indispensable tools for biological inquiry since they allow the modeler to predict how molecular interactions give rise to nonlinear system output. Model exploration with the objective of generating qualitative hypotheses about the workings of a pathway is usually the first step in the modeling process. It involves simulating the gene network model under a very large range of conditions, due to the large uncertainty in interactions and kinetic parameters. This makes model exploration highly computationally demanding. Furthermore, with no prior information about the model behavior, labor-intensive manual inspection of very large amounts of simulation results becomes necessary. This limits systematic computational exploration to simplistic models. Results: We have developed an interactive, smart workflow for model exploration based on semi-supervised learning and human-in-the-loop labeling of data. The workflow lets a modeler rapidly discover ranges of interesting behaviors predicted by the model. Exploiting the fact that similar simulation outputs lie in proximity to each other in a feature space, the modeler can focus on informing the system about which behaviors are more interesting than others by labeling, rather than analyzing simulation results with custom scripts and workflows. This results in a large reduction in time-consuming manual work by the modeler early in a modeling project, which can substantially reduce the time needed to go from an initial model to testable predictions and downstream analysis. Availability and implementation: A Python package is available at https://github.com/Wrede/mio.git. Supplementary information: Supplementary data are available at Bioinformatics online.
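The proximity idea at the heart of this workflow can be illustrated with a toy nearest-neighbor label propagation: a few human-provided labels are spread to unlabeled simulation outputs by distance in feature space. This generic 1-NN sketch, with made-up feature vectors and behavior names, is not the semi-supervised method implemented in the mio package.

```python
# Toy illustration: because similar simulation outputs lie close together
# in a feature space, a handful of human-provided labels can be propagated
# to unlabeled outputs by nearest-neighbor assignment.
def nearest_label(point, labeled):
    """Return the label of the closest labeled feature vector."""
    def dist2(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return min(labeled, key=lambda item: dist2(point, item[0]))[1]

# Human-in-the-loop step: the modeler labels two representative outputs
# (feature vectors and behavior names here are purely hypothetical).
labeled = [((0.0, 0.0), "oscillatory"), ((10.0, 10.0), "steady-state")]

# The system propagates those labels to the remaining simulation outputs.
unlabeled = [(0.5, 1.0), (9.0, 8.5), (1.2, 0.3)]
print([nearest_label(p, labeled) for p in unlabeled])
```

The payoff matches the abstract's claim: the modeler labels two points instead of inspecting all simulation results, and the geometry of the feature space does the rest.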
46

Nasir, Waqas, Alejandro Gomez Toledo, Fredrik Noborn, et al. "SweetNET: A Bioinformatics Workflow for Glycopeptide MS/MS Spectral Analysis." Journal of Proteome Research 15, no. 8 (2016): 2826–40. http://dx.doi.org/10.1021/acs.jproteome.6b00417.

47

Fiannaca, Antonino, Massimo La Rosa, Salvatore Gaglio, Riccardo Rizzo, and Alfonso Urso. "An ontological-based knowledge organization for bioinformatics workflow management system." EMBnet.journal 18, B (2012): 110. http://dx.doi.org/10.14806/ej.18.b.570.

48

Li, Jing, Zengliu Su, Ze-Qiang Ma, et al. "A Bioinformatics Workflow for Variant Peptide Detection in Shotgun Proteomics." Molecular & Cellular Proteomics 10, no. 5 (2011): M110.006536. http://dx.doi.org/10.1074/mcp.m110.006536.

49

Mahoui, Malika, Lingma Lu, Ning Gao, et al. "A Dynamic Workflow Approach for the Integration of Bioinformatics Services." Cluster Computing 8, no. 4 (2005): 279–91. http://dx.doi.org/10.1007/s10586-005-4095-1.

50

Poterlowicz, K., and K. Murat. "475 The bioinformatics workflow for epigenetics profiling of progresing melanoma." Journal of Investigative Dermatology 136, no. 9 (2016): S241. http://dx.doi.org/10.1016/j.jid.2016.06.497.
