Journal articles on the topic 'Bioinformatics - Methodology'

Consult the top 50 journal articles for your research on the topic 'Bioinformatics - Methodology.'


1

Gasparovica-Asīte, M., and L. Aleksejeva. "Classification Methodology for Bioinformatics Data Analysis." Automatic Control and Computer Sciences 53, no. 1 (2019): 28–38. http://dx.doi.org/10.3103/s0146411619010073.

2

Hauth, Amy M., and Gertraud Burger. "Methodology for Constructing Problem Definitions in Bioinformatics." Bioinformatics and Biology Insights 2 (January 2008): BBI.S706. http://dx.doi.org/10.4137/bbi.s706.

Abstract:
Motivation: A recurrent criticism is that certain bioinformatics tools do not account for crucial biology and therefore fail to answer the targeted biological question. We posit that the single most important reason for such shortcomings is an inaccurate formulation of the computational problem. Results: Our paper describes how to define a bioinformatics problem so that it captures both the underlying biology and the computational constraints for a particular problem. The proposed model delineates the biological problem comprehensively and conducts an item-by-item bioinformatics transformation resulting in a germane computational problem. This methodology not only facilitates interdisciplinary information flow but also accommodates emerging knowledge and technologies.
3

Saraiya, P., C. North, and K. Duca. "An Insight-Based Methodology for Evaluating Bioinformatics Visualizations." IEEE Transactions on Visualization and Computer Graphics 11, no. 4 (2005): 443–56. http://dx.doi.org/10.1109/tvcg.2005.53.

4

Zvárová, J. "IMIA Conference “Statistical Methodology in Bioinformatics and Clinical Trials”." Methods of Information in Medicine 45, no. 02 (2006): 137–38. http://dx.doi.org/10.1055/s-0038-1634056.

5

Ramlo, Susan E., David McConnell, Zhong-Hui Duan, and Francisco B. Moore. "Evaluating an Inquiry-based Bioinformatics Course Using Q Methodology." Journal of Science Education and Technology 17, no. 3 (2008): 219–25. http://dx.doi.org/10.1007/s10956-008-9090-x.

6

Theodosiou, T., N. Darzentas, L. Angelis, and C. A. Ouzounis. "PuReD-MCL: a graph-based PubMed document clustering methodology." Bioinformatics 24, no. 17 (2008): 1935–41. http://dx.doi.org/10.1093/bioinformatics/btn318.

7

Schulz, S., E. Beisswanger, L. van den Hoek, O. Bodenreider, and E. M. van Mulligen. "Alignment of the UMLS semantic network with BioTop: methodology and assessment." Bioinformatics 25, no. 12 (2009): i69–i76. http://dx.doi.org/10.1093/bioinformatics/btp194.

8

Öztürk, Hakime, Elif Ozkirimli, and Arzucan Özgür. "A novel methodology on distributed representations of proteins using their interacting ligands." Bioinformatics 34, no. 13 (2018): i295–i303. http://dx.doi.org/10.1093/bioinformatics/bty287.

9

Meyer, P., J. Hoeng, J. J. Rice, et al. "Industrial methodology for process verification in research (IMPROVER): toward systems biology verification." Bioinformatics 28, no. 9 (2012): 1193–201. http://dx.doi.org/10.1093/bioinformatics/bts116.

10

Tekwe, C. D., R. J. Carroll, and A. R. Dabney. "Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data." Bioinformatics 28, no. 15 (2012): 1998–2003. http://dx.doi.org/10.1093/bioinformatics/bts306.

11

Gaire, Raj K., James Bailey, Jennifer Bearfoot, Ian G. Campbell, Peter J. Stuckey, and Izhak Haviv. "MIRAGAA—a methodology for finding coordinated effects of microRNA expression changes and genome aberrations in cancer." Bioinformatics 26, no. 2 (2009): 161–67. http://dx.doi.org/10.1093/bioinformatics/btp654.

12

Weiser, Diána, Flóra Nagy, Gergely Bánóczi, et al. "Immobilization engineering – How to design advanced sol–gel systems for biocatalysis?" Green Chemistry 19, no. 16 (2017): 3927–37. http://dx.doi.org/10.1039/c7gc00896a.

13

Liu, Ruiyin, Jian Tao, and Dehui Wang. "An Inference Methodology for Selecting and Clustering Genes Based on Likelihood Ratio Test." International Journal of Pattern Recognition and Artificial Intelligence 30, no. 08 (2016): 1650019. http://dx.doi.org/10.1142/s0218001416500191.

Abstract:
Peddada et al. ("Gene selection and clustering for time-course and dose-response microarray experiments using order-restricted inference," Bioinformatics 19 (2003): 834–841) proposed a new method for selecting and clustering genes according to their time-course or dose-response profiles. Their method requires the assumption of a constant variance over time or among dosages. This homoscedasticity assumption is, however, seldom satisfied in practice. In this paper, by applying Shi’s algorithms (N. Z. Shi, "Maximum likelihood estimation of means and variances from normal populations under simultaneous order restrictions," J. Multivariate Anal. 50 (1994): 282–293) and a modified bootstrap procedure, we propose a generalized order-restricted inference method that relaxes the homoscedasticity restriction. Simulation results show that the procedures considered in this paper and those of Peddada et al. are generally comparable in terms of Type I error rate, while our proposed algorithms are usually more powerful.
14

Pauling, Josch K., and Edda Klipp. "Computational Lipidomics and Lipid Bioinformatics: Filling In the Blanks." Journal of Integrative Bioinformatics 13, no. 1 (2016): 34–51. http://dx.doi.org/10.1515/jib-2016-299.

Abstract:
Lipids are highly diverse metabolites of pronounced importance in health and disease. While metabolomics is a broad field under the omics umbrella that may also relate to lipids, lipidomics is an emerging field which specializes in the identification, quantification and functional interpretation of complex lipidomes. Today, it is possible to identify and distinguish lipids in a high-resolution, high-throughput manner and simultaneously with a great deal of structural detail. However, doing so may produce thousands of mass spectra in a single experiment, which has created a high demand for specialized computational support to analyze these spectral libraries. The computational biology and bioinformatics community has so far established methodology in genomics, transcriptomics and proteomics, but there are many (combinatorial) challenges when it comes to the structural diversity of lipids and their identification, quantification and interpretation. This review gives an overview and outlook on lipidomics research and illustrates ongoing computational and bioinformatics efforts. These efforts are important and necessary steps to advance the lipidomics field alongside the analytical, biochemistry, biomedical and biology communities and to close the gap in available computational methodology between lipidomics and other omics sub-branches.
15

Li, Kening, Yuxin Du, Lu Li, and Dong-Qing Wei. "Bioinformatics Approaches for Anti-cancer Drug Discovery." Current Drug Targets 21, no. 1 (2019): 3–17. http://dx.doi.org/10.2174/1389450120666190923162203.

Abstract:
Drug discovery is important in cancer therapy and precision medicine. Traditional approaches to drug discovery are mainly based on in vivo animal experiments and in vitro drug screening, but these methods are usually expensive and laborious. In the last decade, the explosion of omics data has provided an opportunity for the computational prediction of anti-cancer drugs, improving the efficiency of drug discovery. High-throughput transcriptome data have been widely used for biomarker identification and drug prediction by integration with drug-response data. Moreover, biological network theory and methodology have also been successfully applied to anti-cancer drug discovery, such as studies based on protein-protein interaction networks, drug-target networks and disease-gene networks. In this review, we summarize and discuss the bioinformatics approaches for predicting anti-cancer drugs and drug combinations based on multi-omic data, including transcriptomics, toxicogenomics, functional genomics and biological networks. We believe that this general overview of available databases and current computational methods will be helpful for the development of novel cancer therapy strategies.
16

Brunel, Helena, Joan-Josep Gallardo-Chacón, Alfonso Buil, et al. "MISS: a non-linear methodology based on mutual information for genetic association studies in both population and sib-pairs analysis." Bioinformatics 26, no. 15 (2010): 1811–18. http://dx.doi.org/10.1093/bioinformatics/btq273.

17

Roy, Joy, Eric Cheung, Junaid Bhatti, Abraar Muneem, and Daniel Lobo. "Curation and annotation of planarian gene expression patterns with segmented reference morphologies." Bioinformatics 36, no. 9 (2020): 2881–87. http://dx.doi.org/10.1093/bioinformatics/btaa023.

Abstract:
Motivation: Morphological and genetic spatial data from functional experiments based on genetic, surgical and pharmacological perturbations are being produced at an extraordinary pace in developmental and regenerative biology. However, our ability to extract knowledge from these large datasets is hindered by the lack of formalization methods and tools able to unambiguously describe, centralize and interpret them. Formalizing spatial phenotypes and gene expression patterns is especially challenging in organisms with highly variable morphologies such as planarian worms, which due to their extraordinary regenerative capability can experimentally result in phenotypes with almost any combination of body regions or parts. Results: Here, we present a computational methodology and mathematical formalism to encode and curate the morphological outcomes and gene expression patterns in planaria. Worm morphologies are encoded with mathematical graphs based on anatomical ontology terms to automatically generate reference morphologies. Gene expression patterns are registered to these standard reference morphologies, which can then be annotated automatically with anatomical ontology terms by analyzing the spatial expression patterns and their textual descriptions. This methodology enables the curation and annotation of complex experimental morphologies together with their gene expression patterns in a centralized standardized dataset, paving the way for the extraction of knowledge and reverse-engineering of the much sought-after mechanistic models in planaria and other regenerative organisms. Availability and implementation: We implemented this methodology in a user-friendly graphical software tool, PlanGexQ, freely available together with the data in the manuscript at https://lobolab.umbc.edu/plangexq. Supplementary information: Supplementary data are available at Bioinformatics online.
18

Shavit, Yoli, and Pietro Liò. "Combining a wavelet change point and the Bayes factor for analysing chromosomal interaction data." Mol. BioSyst. 10, no. 6 (2014): 1576–85. http://dx.doi.org/10.1039/c4mb00142g.

Abstract:
We provide a step-by-step statistical bioinformatics solution to the analysis of chromosomal interaction data. To the best of our knowledge, there is currently no available methodology for following this entire pipeline.
19

Su, Yingfeng, Yingxi Liu, Xiuzhen Sun, et al. "Dynamic Mechanism and Effect of Nasal Cycle on the Warming Function Based on Bioinformatics Methodology." Journal of Medical Imaging and Health Informatics 8, no. 6 (2018): 1147–51. http://dx.doi.org/10.1166/jmihi.2018.2428.

20

Boulesteix, Anne-Laure, Silke Janitza, Jochen Kruppa, and Inke R. König. "Overview of random forest methodology and practical guidance with emphasis on computational biology and bioinformatics." Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2, no. 6 (2012): 493–507. http://dx.doi.org/10.1002/widm.1072.

21

Boileau, Philippe, Nima S. Hejazi, and Sandrine Dudoit. "Exploring high-dimensional biological data with sparse contrastive principal component analysis." Bioinformatics 36, no. 11 (2020): 3422–30. http://dx.doi.org/10.1093/bioinformatics/btaa176.

Abstract:
Motivation: Statistical analyses of high-throughput sequencing data have reshaped the biological sciences. In spite of myriad advances, recovering interpretable biological signal from data corrupted by technical noise remains a prevalent open problem. Several classes of procedures, among them classical dimensionality reduction techniques and others incorporating subject-matter knowledge, have provided effective advances. However, no procedure currently satisfies the dual objectives of recovering stable and relevant features simultaneously. Results: Inspired by recent proposals for making use of control data in the removal of unwanted variation, we propose a variant of principal component analysis (PCA), sparse contrastive PCA, that extracts sparse, stable, interpretable and relevant biological signal. The new methodology is compared to competing dimensionality reduction approaches through a simulation study and via analyses of several publicly available protein expression, microarray gene expression and single-cell transcriptome sequencing datasets. Availability and implementation: A free and open-source software implementation of the methodology, the scPCA R package, is made available via the Bioconductor Project. Code for all analyses presented in this article is also available via GitHub. Contact: philippe_boileau@berkeley.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
22

Karatzas, Evangelos, Margarita Zachariou, Marilena M. Bourdakou, et al. "PathWalks: identifying pathway communities using a disease-related map of integrated information." Bioinformatics 36, no. 13 (2020): 4070–79. http://dx.doi.org/10.1093/bioinformatics/btaa291.

Abstract:
Motivation: Understanding the underlying biological mechanisms and respective interactions of a disease remains an elusive, time-consuming and costly task. Computational methodologies that propose pathway/mechanism communities and reveal respective relationships can be of great value, as they can help expedite the process of identifying how perturbations in a single pathway can affect other pathways. Results: We present a random-walks-based methodology called PathWalks, where a walker crosses a pathway-to-pathway network under the guidance of a disease-related map. The latter is a gene network that we construct by integrating multi-source information regarding a specific disease. The most frequent trajectories highlight communities of pathways that are expected to be strongly related to the disease under study. We apply the PathWalks methodology to Alzheimer's disease and idiopathic pulmonary fibrosis and establish that it can highlight pathways that are also identified by other pathway analysis tools and are backed by bibliographic references. More importantly, PathWalks produces additional new pathways that are functionally connected with those already established, giving insight for further experimentation. Availability and implementation: https://github.com/vagkaratzas/PathWalks. Supplementary information: Supplementary data are available at Bioinformatics online.
23

Rajan Keshri, Harpreet Kaur, Dhanashri R. Patil, and Nilesh N. Sonawane. "Bioinformatics Analysis of B. Abortus Strain 2308 Protein and Its Drug Docking." International Journal for Research in Applied Sciences and Biotechnology 8, no. 1 (2021): 175–200. http://dx.doi.org/10.31033/ijrasb.8.1.21.

Abstract:
Brucellosis is among the fastest-spreading diseases on Earth, causing casualties in livestock as well as in humans, and there is an urgent need to study its causative agent. Brucellosis is an infectious disease caused by Brucella abortus. Here, we have carried out an in silico analysis of Brucella abortus strain 2308, the strain mainly responsible for the disease, by means of bioinformatics methodology. We have run several in silico tools to predict its structure and function. Moreover, we have carried out protein homology modelling on this strain. Furthermore, we have also carried out several protein chain analyses, protein-protein interface analysis, structure alignment, structure prediction and active site prediction. To make our study more productive we have also performed drug docking. These results will add a little information to the emerging bioinformatics data regarding Brucella abortus.
24

Baig, Abdul Mannan. "Innovative Methodology in the Discovery of Novel Drug Targets in the Free-Living Amoebae." Current Drug Targets 20, no. 1 (2018): 60–69. http://dx.doi.org/10.2174/1389450119666180426100452.

Abstract:
Despite advances in drug discovery and modifications of chemotherapeutic regimens, human infections caused by free-living amoebae (FLA) have high mortality rates (~95%). The FLA that cause fatal human cerebral infections include Naegleria fowleri, Balamuthia mandrillaris and Acanthamoeba spp. Novel drug-target discovery remains the only viable option to tackle these central nervous system (CNS) infections in order to lower the mortality rates caused by the FLA. Of these FLA, N. fowleri causes primary amoebic meningoencephalitis (PAM), while A. castellanii and B. mandrillaris are known to cause granulomatous amoebic encephalitis (GAE). The infections caused by the FLA have been treated with drugs like rifampin, fluconazole, amphotericin B and miltefosine. Miltefosine is an anti-leishmanial agent and an experimental anti-cancer drug. With only rare instances of success, these drugs have remained unable to lower the mortality rates of the cerebral infections caused by FLA. Recently, with the help of bioinformatic computational tools and the available genomic data of the FLA, the discovery of newer drug targets has become possible. These cellular targets are proteins that are either unique to the FLA or shared between humans and these unicellular eukaryotes. The latter group of proteins has been shown to be targeted by some FDA-approved drugs prescribed in non-infectious diseases. This review outlines the bioinformatics methodologies that can be used in the discovery of such novel drug targets, their track record in in-vitro assays done in the past and the translational value of such target discoveries in human diseases caused by FLA.
25

Yohe, Sophia, and Bharat Thyagarajan. "Review of Clinical Next-Generation Sequencing." Archives of Pathology & Laboratory Medicine 141, no. 11 (2017): 1544–57. http://dx.doi.org/10.5858/arpa.2016-0501-ra.

Abstract:
Context.— Next-generation sequencing (NGS) is a technology being used by many laboratories to test for inherited disorders and tumor mutations. This technology is new for many practicing pathologists, who may not be familiar with the uses, methodology, and limitations of NGS. Objective.— To familiarize pathologists with several aspects of NGS, including current and expanding uses; methodology including wet bench aspects, bioinformatics, and interpretation; validation and proficiency; limitations; and issues related to the integration of NGS data into patient care. Data Sources.— The review is based on peer-reviewed literature and personal experience using NGS in a clinical setting at a major academic center. Conclusions.— The clinical applications of NGS will increase as the technology, bioinformatics, and resources evolve to address the limitations and improve quality of results. The challenge for clinical laboratories is to ensure testing is clinically relevant, cost-effective, and can be integrated into clinical care.
26

Croset, Samuel, Joachim Rupp, and Martin Romacker. "Flexible data integration and curation using a graph-based approach." Bioinformatics 32, no. 6 (2015): 918–25. http://dx.doi.org/10.1093/bioinformatics/btv644.

Abstract:
Motivation: The increasing diversity of data available to the biomedical scientist holds promise for better understanding of diseases and discovery of new treatments for patients. In order to provide a complete picture of a biomedical question, data from many different origins needs to be combined into a unified representation. During this data integration process, inevitable errors and ambiguities present in the initial sources compromise the quality of the resulting data warehouse, and greatly diminish the scientific value of the content. Expensive and time-consuming manual curation is then required to improve the quality of the information. However, it becomes increasingly difficult to dedicate and optimize the resources for data integration projects as available repositories are growing both in size and in number every day. Results: We present a new generic methodology to identify problematic records, causing what we describe as ‘data hairball’ structures. The approach is graph-based and relies on two metrics traditionally used in social sciences: the graph density and the betweenness centrality. We evaluate and discuss these measures and show their relevance for flexible, optimized and automated data curation and linkage. The methodology focuses on information coherence and correctness to improve the scientific meaningfulness of data integration endeavors, such as knowledge bases and large data warehouses. Contact: samuel.croset@roche.com. Supplementary information: Supplementary data are available at Bioinformatics online.
27

Gilbert, James, Nicole Pearcy, Rupert Norman, et al. "Gsmodutils: a python based framework for test-driven genome scale metabolic model development." Bioinformatics 35, no. 18 (2019): 3397–403. http://dx.doi.org/10.1093/bioinformatics/btz088.

Abstract:
Motivation: Genome scale metabolic models (GSMMs) are increasingly important for systems biology and metabolic engineering research, as they are capable of simulating complex steady-state behaviour. Constraints-based models of this form can include thousands of reactions and metabolites, with many crucial pathways that only become activated in specific simulation settings. However, despite their widespread use and power, and the availability of tools to aid with the construction and analysis of large scale models, little methodology is suggested for their continued management. For example, when genome annotations are updated or new understanding regarding behaviour is discovered, models often need to be altered to reflect this. This is quickly becoming an issue for industrial systems and synthetic biotechnology applications, which require good quality reusable models integral to the design, build, test and learn cycle. Results: As part of an ongoing effort to improve genome scale metabolic analysis, we have developed a test-driven development methodology for the continuous integration of validation data from different sources. Contributing to the open source technology based around COBRApy, we have developed the gsmodutils modelling framework, placing an emphasis on test-driven design of models through defined test cases. Crucially, different conditions are configurable, allowing users to examine how different designs or curation impact a wide range of system behaviours, minimizing error between model versions. Availability and implementation: The software framework described within this paper is open source and freely available from http://github.com/SBRCNottingham/gsmodutils. Supplementary information: Supplementary data are available at Bioinformatics online.
28

Gramstad, Oddgeir, Jan Oystein Haavig Bakke, Lars Sonneland, and Carlos Eduardo Abreu. "Simultaneous extraction of stratigraphic sequences using iterative seismic DNA detection." Interpretation 2, no. 4 (2014): T167–T176. http://dx.doi.org/10.1190/int-2014-0069.1.

Abstract:
We developed a new stratigraphic interpretation methodology enabling the extraction of stratigraphic sequences of surfaces. This methodology consisted of two main components. The first component was inspired by DNA search technology used in bioinformatics, in which the goal was to detect base-pair sequences in DNA molecules. The base-pair sequences may be modified by mutations, so the search technology must be able to take such mutations into account. We adapted this technology to search for seismic reflection sequences, acknowledging that seismic reflections were likely to vary laterally, somewhat analogous to mutations in bioinformatics. One example of a seismic reflection sequence was a geologically sorted sequence of seismic reflectors. The second component of the workflow was how to connect the DNA hits into continuous surfaces. A quality metric was used to steer the connection process of the DNA hits in sorted order. This workflow enabled the user to extract a sequence of surfaces simultaneously, which is crucial in seismic stratigraphic interpretation.
29

Sousa, Sílvia A., António M. M. Seixas, Manoj Mandal, Manuel J. Rodríguez-Ortega, and Jorge H. Leitão. "Characterization of the Burkholderia cenocepacia J2315 Surface-Exposed Immunoproteome." Vaccines 8, no. 3 (2020): 509. http://dx.doi.org/10.3390/vaccines8030509.

Abstract:
Infections by the Burkholderia cepacia complex (Bcc) remain seriously life threatening to cystic fibrosis (CF) patients, and no effective eradication is available. A vaccine to protect patients against Bcc infections is a highly attractive therapeutic option, but none is available. A strategy combining the bioinformatics identification of putative surface-exposed proteins with an experimental approach encompassing the “shaving” of surface-exposed proteins with trypsin followed by peptide identification by liquid chromatography and mass spectrometry is here reported. The methodology allowed the bioinformatics identification of 263 potentially surface-exposed proteins, 16 of them also experimentally identified by the “shaving” approach. Of the proteins identified, 143 have a high probability of containing B-cell epitopes that are surface-exposed. The immunogenicity of three of these proteins was demonstrated using serum samples from Bcc-infected CF patients and Western blotting, validating the usefulness of this methodology in identifying potentially immunogenic surface-exposed proteins that might be used for the development of Bcc-protective vaccines.
30

Srinivasan, Shyam, William R. Cluett, and Radhakrishnan Mahadevan. "A scalable method for parameter identification in kinetic models of metabolism using steady-state data." Bioinformatics 35, no. 24 (2019): 5216–25. http://dx.doi.org/10.1093/bioinformatics/btz445.

Abstract:
Motivation: In kinetic models of metabolism, the parameter values determine the dynamic behaviour predicted by these models. Estimating parameters from in vivo experimental data requires the parameters to be structurally identifiable, and the data to be informative enough to estimate these parameters. Existing methods to determine the structural identifiability of parameters in kinetic models of metabolism can only be applied to models of small metabolic networks due to their computational complexity. Additionally, a priori experimental design, a necessity for obtaining informative data for parameter estimation, also does not account for using steady-state data to estimate parameters in kinetic models. Results: Here, we present a scalable methodology to structurally identify parameters for each flux in a kinetic model of metabolism based on the availability of steady-state data. In doing so, we also address the issue of determining the number and nature of experiments for generating steady-state data to estimate these parameters. By using a small metabolic network as an example, we show that most parameters in fluxes expressed by mechanistic enzyme kinetic rate laws can be identified using steady-state data, and the steady-state data required for their estimation can be obtained from selective experiments involving both substrate and enzyme level perturbations. The methodology can be used in combination with other identifiability and experimental design algorithms that use dynamic data to determine the most informative experiments requiring the least resources to perform. Availability and implementation: https://github.com/LMSE/ident. Supplementary information: Supplementary data are available at Bioinformatics online.
31

Pudlo, Pierre, Jean-Michel Marin, Arnaud Estoup, Jean-Marie Cornuet, Mathieu Gautier, and Christian P. Robert. "Reliable ABC model choice via random forests." Bioinformatics 32, no. 6 (2015): 859–66. http://dx.doi.org/10.1093/bioinformatics/btv684.

Abstract:
Motivation: Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. Results: We propose a novel approach based on a machine learning tool named random forests (RF) to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with RF and postponing the approximation of the posterior probability of the selected model for a second stage also relying on RF. Compared with earlier implementations of ABC model choice, the ABC RF approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least 50) and (iv) it includes an approximation of the posterior probability of the selected model. The call to RF will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. Availability and implementation: The proposed methodology is implemented in the R package abcrf available on the CRAN. Contact: jean-michel.marin@umontpellier.fr. Supplementary information: Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
32

Matlock, Kevin, Raziur Rahman, Souparno Ghosh, and Ranadip Pal. "Sstack: an R package for stacking with applications to scenarios involving sequential addition of samples and features." Bioinformatics 35, no. 17 (2019): 3143–45. http://dx.doi.org/10.1093/bioinformatics/btz010.

Full text
Abstract:
Summary Biological processes are characterized by a variety of different genomic feature sets. However, often when building models, portions of these features are missing for a subset of the dataset. We provide a modeling framework to effectively integrate this type of heterogeneous data to improve prediction accuracy. To test our methodology, we have stacked data from the Cancer Cell Line Encyclopedia to increase the accuracy of drug sensitivity prediction. The package addresses the dynamic regime of information integration involving sequential addition of features and samples. Availability and implementation The framework has been implemented as an R package, Sstack, which can be downloaded from https://cran.r-project.org/web/packages/Sstack/index.html, where further explanation of the package is available. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
33

Shegogue, Daniel, and W. Jim Zheng. "Object-oriented biological system integration: a SARS coronavirus example." Bioinformatics 21, no. 10 (2005): 2502–9. http://dx.doi.org/10.1093/bioinformatics/bti344.

Full text
Abstract:
Motivation The importance of studying biology at the system level has been well recognized, yet there is no well-defined process or consistent methodology to integrate and represent biological information at this level. To overcome this hurdle, a blending of disciplines such as computer science and biology is necessary. Results By applying an adapted, sequential software engineering process, a complex biological system (severe acute respiratory syndrome-coronavirus viral infection) has been reverse-engineered and represented as an object-oriented software system. The scalability of this object-oriented software engineering approach indicates that we can apply this technology for the integration of large complex biological systems. Availability A navigable web-based version of the system is freely available at http://people.musc.edu/~zhengw/SARS/Software-Process.htm Contact zhengw@musc.edu Supplementary information Supplemental data: Table 1 and Figures 1–16.
APA, Harvard, Vancouver, ISO, and other styles
34

Makris, Christos, Georgios Pispirigos, and Michael Angelos Simos. "Text Semantic Annotation: A Distributed Methodology Based on Community Coherence." Algorithms 13, no. 7 (2020): 160. http://dx.doi.org/10.3390/a13070160.

Full text
Abstract:
Text annotation is the process of identifying the sense of a textual segment within a given context to a corresponding entity on a concept ontology. As the bag of words paradigm’s limitations become increasingly discernible in modern applications, several information retrieval and artificial intelligence tasks are shifting to semantic representations for addressing the inherent natural language polysemy and homonymy challenges. With extensive application in a broad range of scientific fields, such as digital marketing, bioinformatics, chemical engineering, neuroscience, and social sciences, community detection has attracted great scientific interest. Focusing on linguistics, by aiming to identify groups of densely interconnected subgroups of semantic ontologies, community detection application has proven beneficial in terms of disambiguation improvement and ontology enhancement. In this paper we introduce a novel distributed supervised knowledge-based methodology employing community detection algorithms for text annotation with Wikipedia Entities, establishing the unprecedented concept of community Coherence as a metric for local contextual coherence compatibility. Our experimental evaluation revealed that deeper inference of relatedness and local entity community coherence in the Wikipedia graph bears substantial improvements overall via a focus on accuracy amelioration of less common annotations. The proposed methodology is propitious for wider adoption, attaining robust disambiguation performance.
APA, Harvard, Vancouver, ISO, and other styles
35

Mauguen, Audrey, Venkatraman E. Seshan, Colin B. Begg, and Irina Ostrovnaya. "Testing clonal relatedness of two tumors from the same patient based on their mutational profiles: update of the Clonality R package." Bioinformatics 35, no. 22 (2019): 4776–78. http://dx.doi.org/10.1093/bioinformatics/btz486.

Full text
Abstract:
Summary The Clonality R package is a practical tool to assess the clonal relatedness of two tumors from the same patient. We have previously presented its functionality for testing tumors using loss of heterozygosity data or copy number arrays. Since then somatic mutation data have become more widely available through next generation sequencing and we have developed new methodology for comparing the tumors’ mutational profiles. We thus extended the package to include these two new methods for comparing tumors as well as the mutational frequency estimation from external data required for their implementation. The first method is a likelihood ratio test that is readily available on a patient-by-patient basis. The second method employs a random-effects model to estimate both the population and individual probabilities of clonal relatedness from a group of patients with pairs of tumors. The package is available on Bioconductor. Availability and implementation Bioconductor (http://bioconductor.org/packages/release/bioc/html/Clonality.html). Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
36

Zingaretti, Laura M., Gilles Renand, Diego P. Morgavi, and Yuliaxis Ramayo-Caldas. "Link-HD: a versatile framework to explore and integrate heterogeneous microbial communities." Bioinformatics 36, no. 7 (2019): 2298–99. http://dx.doi.org/10.1093/bioinformatics/btz862.

Full text
Abstract:
Motivation We present Link-HD, an approach to integrate multiple datasets. Link-HD is a generalization of ‘Structuration des Tableaux A Trois Indices de la Statistique–Analyse Conjointe de Tableaux’, a family of methods designed to integrate information from heterogeneous data. Here, we extend the classical approach to deal with broader datasets (e.g. compositional data), methods for variable selection and taxon-set enrichment analysis. Results The methodology is demonstrated by integrating rumen microbial communities from cows for which methane yield (CH4y) was individually measured. Our approach reproduces the significant link between rumen microbiota structure and CH4 emission. When analyzing the TARA Oceans data, Link-HD replicates published results, highlighting the relevance of temperature with members of phyla Proteobacteria on the structure and functionality of this ecosystem. Availability and implementation The source code, examples and a complete manual are freely available in GitHub https://github.com/lauzingaretti/LinkHD and in Bioconductor https://bioconductor.org/packages/release/bioc/html/LinkHD.html.
APA, Harvard, Vancouver, ISO, and other styles
37

Anderson, Warren D., Fabiana M. Duarte, Mete Civelek, and Michael J. Guertin. "Defining data-driven primary transcript annotations with primaryTranscriptAnnotation in R." Bioinformatics 36, no. 9 (2020): 2926–28. http://dx.doi.org/10.1093/bioinformatics/btaa011.

Full text
Abstract:
Summary Nascent transcript measurements derived from run-on sequencing experiments are critical for the investigation of transcriptional mechanisms and regulatory networks. However, conventional mRNA gene annotations significantly differ from the boundaries of primary transcripts. New primary transcript annotations are needed to accurately interpret run-on data. We developed the primaryTranscriptAnnotation R package to infer the transcriptional start and termination sites of primary transcripts from genomic run-on data. We then used these inferred coordinates to annotate transcriptional units identified de novo. This package provides the novel utility to integrate data-driven primary transcript annotations with transcriptional unit coordinates identified in an unbiased manner. Highlighting the importance of using accurate primary transcript coordinates, we demonstrate that this new methodology increases the detection of differentially expressed transcripts and provides more accurate quantification of RNA polymerase pause indices. Availability and implementation https://github.com/WarrenDavidAnderson/genomicsRpackage/tree/master/primaryTranscriptAnnotation. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
38

Dowsey, Andrew W. "The need for statistical contributions to bioinformatics at scale, with illustration to mass spectrometry." Statistical Modelling 17, no. 4-5 (2017): 290–99. http://dx.doi.org/10.1177/1471082x17708519.

Full text
Abstract:
In their article, Morris and Baladandayuthapani clearly evidence the influence of statisticians in recent methodological advances throughout the bioinformatics pipeline and advocate for the expansion of this role. The latest acquisition platforms, such as next generation sequencing (genomics/transcriptomics) and hyphenated mass spectrometry (proteomics/metabolomics), output raw datasets in the order of gigabytes; it is not unusual to acquire a terabyte or more of data per study. The increasing computational burden this brings is a further impediment against the use of statistically rigorous methodology in the pre-processing stages of the bioinformatics pipeline. In this discussion I describe the mass spectrometry pipeline and use it as an example to show that beneath this challenge lies a two-fold opportunity: (a) Biological complexity and dynamic range is still well beyond what is captured by current processing methodology; hence, potential biomarkers and mechanistic insights are consistently missed; (b) Statistical science could play a larger role in optimizing the acquisition process itself. Data rates will continue to increase as routine clinical omics analysis moves to large-scale facilities with systematic, standardized protocols. Key inferential gains will be achieved by borrowing strength across the sum total of all analyzed studies, a task best underpinned by appropriate statistical modelling.
APA, Harvard, Vancouver, ISO, and other styles
39

Sherif, Hisham M. F. "Dissecting the Dissection." AORTA 03, no. 03 (2015): 108–17. http://dx.doi.org/10.12945/j.aorta.2015.14.060.

Full text
Abstract:
Aortic dissection remains one of the most devastating diseases. Current practice guidelines provide diagnostic and therapeutic interventions based primarily on the aortic diameter. The level of evidence supporting these recommendations is Level C or “Expert Opinion.” Since aortic dissection is a catastrophic structural failure, its investigation along the guidelines of accident investigation may offer a useful alternative, utilizing process mapping and root-cause analysis methodology. Since the objective of practice guidelines is to address the risk of serious events, the utilization of a probabilistic predictive modeling methodology, using bioinformatics tools, may offer a more comprehensive risk assessment.
APA, Harvard, Vancouver, ISO, and other styles
40

Ouyang, Zenhwa, Jan Sargeant, Alison Thomas, et al. "A scoping review of ‘big data’, ‘informatics’, and ‘bioinformatics’ in the animal health and veterinary medical literature." Animal Health Research Reviews 20, no. 1 (2019): 1–18. http://dx.doi.org/10.1017/s1466252319000136.

Full text
Abstract:
Research in big data, informatics, and bioinformatics has grown dramatically (Andreu-Perez J, et al., 2015, IEEE Journal of Biomedical and Health Informatics 19, 1193–1208). Advances in gene sequencing technologies, surveillance systems, and electronic medical records have increased the amount of health data available. Unconventional data sources such as social media, wearable sensors, and internet search engine activity have also contributed to the influx of health data. The purpose of this study was to describe how ‘big data’, ‘informatics’, and ‘bioinformatics’ have been used in the animal health and veterinary medical literature and to map and chart publications using these terms through time. A scoping review methodology was used. A literature search of the terms ‘big data’, ‘informatics’, and ‘bioinformatics’ was conducted in the context of animal health and veterinary medicine. Relevance screening on abstract and full-text was conducted sequentially. In order for articles to be relevant, they must have used the words ‘big data’, ‘informatics’, or ‘bioinformatics’ in the title or abstract and full-text and have dealt with one of the major animal species encountered in veterinary medicine. Data items collected for all relevant articles included species, geographic region, first author affiliation, and journal of publication. The study level, study type, and data sources were collected for primary studies. After relevance screening, 1093 articles were classified. While there was a steady increase in ‘bioinformatics’ articles between 1995 and the end of the study period, ‘informatics’ articles reached their peak in 2012, then declined. The first ‘big data’ publication in animal health and veterinary medicine was in 2012. While few articles used the term ‘big data’ (n = 14), recent growth in ‘big data’ articles was observed. All geographic regions produced publications in ‘informatics’ and ‘bioinformatics’ while only North America, Europe, Asia, and Australia/Oceania produced publications about ‘big data’. ‘Bioinformatics’ primary studies tended to use genetic data and tended to be conducted at the genetic level. In contrast, ‘informatics’ primary studies tended to use non-genetic data sources and to be conducted at an organismal level. The rapidly evolving definition of ‘big data’ may lead to avoidance of the term.
APA, Harvard, Vancouver, ISO, and other styles
41

Ren, Yan, Siva Sivaganesan, Nicholas A. Clark, et al. "Predicting mechanism of action of cellular perturbations with pathway activity signatures." Bioinformatics 36, no. 18 (2020): 4781–88. http://dx.doi.org/10.1093/bioinformatics/btaa590.

Full text
Abstract:
Motivation Misregulation of signaling pathway activity is etiologic for many human diseases, and modulating activity of signaling pathways is often the preferred therapeutic strategy. Understanding the mechanism of action (MOA) of bioactive chemicals in terms of targeted signaling pathways is the essential first step in evaluating their therapeutic potential. Changes in signaling pathway activity are often not reflected in changes in expression of pathway genes, which makes MOA inferences from transcriptional signatures (TSes) a difficult problem. Results We developed a new computational method for implicating pathway targets of bioactive chemicals and other cellular perturbations by integrated analysis of pathway network topology, the Library of Integrated Network-based Cellular Signature TSes of genetic perturbations of pathway genes and the TS of the perturbation. Our methodology accurately predicts signaling pathways targeted by the perturbation when current pathway analysis approaches utilizing only the TS of the perturbation fail. Availability and implementation Open source R package paslincs is available at https://github.com/uc-bd2k/paslincs. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
42

Diament, Alon, Iddo Weiner, Noam Shahar, et al. "ChimeraUGEM: unsupervised gene expression modeling in any given organism." Bioinformatics 35, no. 18 (2019): 3365–71. http://dx.doi.org/10.1093/bioinformatics/btz080.

Full text
Abstract:
Motivation Regulation of the amount of protein that is synthesized from genes has proved to be a serious challenge in terms of analysis and prediction, and in terms of engineering and optimization, due to the large diversity in expression machinery across species. Results To address this challenge, we developed a methodology and a software tool (ChimeraUGEM) for predicting gene expression as well as adapting the coding sequence of a target gene to any host organism. We demonstrate these methods by predicting protein levels in seven organisms, in seven human tissues, and by increasing in vivo the expression of a synthetic gene up to 26-fold in the single-cell green alga Chlamydomonas reinhardtii. The underlying model is designed to capture sequence patterns and regulatory signals with minimal prior knowledge on the host organism and can be applied to a multitude of species and applications. Availability and implementation Source code (MATLAB, C) and binaries are freely available for download for non-commercial use at http://www.cs.tau.ac.il/~tamirtul/ChimeraUGEM/, and supported on macOS, Linux and Windows. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
43

Julienne, Hanna, Huwenbo Shi, Bogdan Pasaniuc, and Hugues Aschard. "RAISS: robust and accurate imputation from summary statistics." Bioinformatics 35, no. 22 (2019): 4837–39. http://dx.doi.org/10.1093/bioinformatics/btz466.

Full text
Abstract:
Motivation Multi-trait analyses using public summary statistics from genome-wide association studies (GWASs) are becoming increasingly popular. A constraint of multi-trait methods is that they require complete summary data for all traits. Although methods for the imputation of summary statistics exist, they lack precision for genetic variants with small effect size. This is benign for univariate analyses where only variants with large effect size are selected a posteriori. However, it can lead to strong p-value inflation in multi-trait testing. Here we present a new approach that improves the existing imputation methods and reaches a precision suitable for multi-trait analyses. Results We fine-tuned parameters to obtain a very high accuracy imputation from summary statistics. We demonstrate this accuracy for variants of all effect sizes on real data from 28 GWASs. We implemented the resulting methodology in a python package specially designed to efficiently impute multiple GWAS in parallel. Availability and implementation The python package is available at: https://gitlab.pasteur.fr/statistical-genetics/raiss, its accompanying documentation is accessible here http://statistical-genetics.pages.pasteur.fr/raiss/. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
44

Sadovsky, M. G. "Information Capacity of Symbol Sequences." Open Systems & Information Dynamics 09, no. 01 (2002): 37–49. http://dx.doi.org/10.1023/a:1014230928565.

Full text
Abstract:
The information capacity of sequences is considered through the calculation of the specific entropy of their frequency dictionary. The specific entropy was calculated against the reconstructed dictionary, which bears the most probable continuations of shorter strings. The measure developed makes it possible to distinguish sequences both from random ones and from those with a high level of (rather simple) order. Some applications of the developed methodology to genetics, bioinformatics, and linguistics are discussed.
APA, Harvard, Vancouver, ISO, and other styles
45

Dorgham, Doaa M. Talaat, Nahla A. Belal, and Walid Abdelmoez. "Early Performance Prediction in Bioinformatics Systems Using Palladio Component Modeling." Applied Sciences 11, no. 12 (2021): 5426. http://dx.doi.org/10.3390/app11125426.

Full text
Abstract:
Bioinformatics is a branch of science that uses computers, algorithms, and databases to solve biological problems. To achieve more accurate results, researchers need to use large and complex datasets. Sequence alignment is a well-known field of bioinformatics that allows the comparison of different genomic sequences. Comparative genomics builds on such comparisons, leading to benefits in areas such as evolutionary biology, agriculture, and human health (e.g., mutation testing connects unknown genes to diseases). However, software engineering best practices, such as software performance engineering, are not taken into consideration in most bioinformatics tools and frameworks, which may lead to serious performance problems. Having an estimate of the software performance in the early phases of the Software Development Life Cycle (SDLC) is beneficial in making better decisions relating to the software design. Software performance engineering provides a reliable and observable method to build systems that can achieve their required performance goals. In this paper, we introduce the use of the Palladio Component Modeling (PCM) methodology to predict the performance of a sequence alignment system. Software performance engineering was not considered during the original system development. As a result of the performance analysis, an alternative design is proposed. Comparing the performance of the proposed design against the one already developed, a better response time is obtained. The response time of the usage scenario is reduced from 16 to 8.6 s. The study results show that using performance models at early stages in bioinformatics systems can help to achieve better software system performance.
APA, Harvard, Vancouver, ISO, and other styles
46

Gao, Xin, Deqing Hu, Madelaine Gogol, and Hua Li. "ClusterMap: compare multiple single cell RNA-Seq datasets across different experimental conditions." Bioinformatics 35, no. 17 (2019): 3038–45. http://dx.doi.org/10.1093/bioinformatics/btz024.

Full text
Abstract:
Motivation Single cell RNA-Seq (scRNA-Seq) facilitates the characterization of cell type heterogeneity and developmental processes. Further study of single cell profiles across different conditions enables the understanding of biological processes and underlying mechanisms at the sub-population level. However, developing proper methodology to compare multiple scRNA-Seq datasets remains challenging. Results We have developed ClusterMap, a systematic method and workflow to facilitate the comparison of scRNA-seq profiles across distinct biological contexts. Using hierarchical clustering of the marker genes of each sub-group, ClusterMap matches the sub-types of cells across different samples and provides ‘similarity’ as a metric to quantify the quality of the match. We introduce a purity tree cut method designed specifically for this matching problem. We use Circos plot and regrouping method to visualize the results concisely. Furthermore, we propose a new metric ‘separability’ to summarize sub-population changes among all sample pairs. In the case studies, we demonstrate that ClusterMap has the ability to provide us further insight into the different molecular mechanisms of cellular sub-populations across different conditions. Availability and implementation ClusterMap is implemented in R and available at https://github.com/xgaoo/ClusterMap. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
47

Ponzoni, Luca, Daniel A. Peñaherrera, Zoltán N. Oltvai, and Ivet Bahar. "Rhapsody: predicting the pathogenicity of human missense variants." Bioinformatics 36, no. 10 (2020): 3084–92. http://dx.doi.org/10.1093/bioinformatics/btaa127.

Full text
Abstract:
Motivation The biological effects of human missense variants have been studied experimentally for decades but predicting their effects in clinical molecular diagnostics remains challenging. Available computational tools are usually based on the analysis of sequence conservation and structural properties of the mutant protein. We recently introduced a new machine learning method that demonstrated for the first time the significance of protein dynamics in determining the pathogenicity of missense variants. Results Here, we present a new interface (Rhapsody) that enables fully automated assessment of pathogenicity, incorporating both sequence coevolution data and structure- and dynamics-based features. Benchmarked against a dataset of about 20 000 annotated variants, the methodology is shown to outperform well-established and/or advanced prediction tools. We illustrate the utility of Rhapsody by in silico saturation mutagenesis studies of human H-Ras, phosphatase and tensin homolog and thiopurine S-methyltransferase. Availability and implementation The new tool is available both as an online webserver at http://rhapsody.csb.pitt.edu and as an open-source Python package (GitHub repository: https://github.com/prody/rhapsody; PyPI package installation: pip install prody-rhapsody). Links to additional resources, tutorials and package documentation are provided in the 'Python package' section of the website. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
48

Jenkinson, Garrett, Yang I. Li, Shubham Basu, Margot A. Cousin, Gavin R. Oliver, and Eric W. Klee. "LeafCutterMD: an algorithm for outlier splicing detection in rare diseases." Bioinformatics 36, no. 17 (2020): 4609–15. http://dx.doi.org/10.1093/bioinformatics/btaa259.

Full text
Abstract:
Motivation Next-generation sequencing is rapidly improving diagnostic rates in rare Mendelian diseases, but even with whole genome or whole exome sequencing, the majority of cases remain unsolved. Increasingly, RNA sequencing is being used to solve many cases that evade diagnosis through sequencing alone. Specifically, the detection of aberrant splicing in many rare disease patients suggests that identifying RNA splicing outliers is particularly useful for determining causal Mendelian disease genes. However, there is as yet a paucity of statistical methodologies to detect splicing outliers. Results We developed LeafCutterMD, a new statistical framework that significantly improves the previously published LeafCutter in the context of detecting outlier splicing events. Through simulations and analysis of real patient data, we demonstrate that LeafCutterMD has better power than the state-of-the-art methodology while controlling false-positive rates. When applied to a cohort of disease-affected probands from the Mayo Clinic Center for Individualized Medicine, LeafCutterMD recovered all aberrantly spliced genes that had previously been identified by manual curation efforts. Availability and implementation The source code for this method is available under the open-source Apache 2.0 license in the latest release of the LeafCutter software package available online at http://davidaknowles.github.io/leafcutter. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles
49

Liu, Lin, and Hao Wang. "The Recent Applications and Developments of Bioinformatics and Omics Technologies in Traditional Chinese Medicine." Current Bioinformatics 14, no. 3 (2019): 200–210. http://dx.doi.org/10.2174/1574893614666190102125403.

Full text
Abstract:
Background: Traditional Chinese Medicine (TCM) is widely utilized as complementary health care in China whose acceptance is still hindered by conventional scientific research methodology, although it has been exercised and implemented for nearly 2000 years. Identifying the molecular mechanisms, targets and bioactive components in TCM is a critical step in the modernization of TCM because of the complexity and uniqueness of the TCM system. With recent advances in computational approaches and high throughput technologies, it has become possible to understand the potential TCM mechanisms at the molecular and systematic level, to evaluate the effectiveness and toxicity of TCM treatments. Bioinformatics is gaining considerable attention to unearth the in-depth molecular mechanisms of TCM, which emerges as an interdisciplinary approach owing to the explosive omics data and development of computer science. Systems biology, based on the omics techniques, opens up a new perspective which enables us to investigate the holistic modulation effect on the body. Objective: This review aims to sum up the recent efforts of bioinformatics and omics techniques in the research of TCM including Systems biology, Metabolomics, Proteomics, Genomics and Transcriptomics. Conclusion: Overall, bioinformatics tools combined with omics techniques have been extensively used to scientifically support the ancient practice of TCM through the acquisition, storage and analysis of biomedical data.
APA, Harvard, Vancouver, ISO, and other styles
50

Ma, Jing, Alla Karnovsky, Farsad Afshinnia, et al. "Differential network enrichment analysis reveals novel lipid pathways in chronic kidney disease." Bioinformatics 35, no. 18 (2019): 3441–52. http://dx.doi.org/10.1093/bioinformatics/btz114.

Full text
Abstract:
Motivation Functional enrichment testing methods can reduce data comprising hundreds of altered biomolecules to smaller sets of altered biological ‘concepts’ that help generate testable hypotheses. This study leveraged differential network enrichment analysis methodology to identify and validate lipid subnetworks that potentially differentiate chronic kidney disease (CKD) by severity or progression. Results We built a partial correlation interaction network, identified highly connected network components, applied network-based gene-set analysis to identify differentially enriched subnetworks, and compared the subnetworks in patients with early-stage versus late-stage CKD. We identified two subnetworks, ‘triacylglycerols’ and ‘cardiolipins-phosphatidylethanolamines (CL-PE)’, characterized by lower connectivity and a higher abundance of longer polyunsaturated triacylglycerols in patients with severe CKD (stage ≥4) from the Clinical Phenotyping Resource and Biobank Core. These findings were replicated in an independent cohort, the Chronic Renal Insufficiency Cohort. Using an innovative method for elucidating biological alterations in lipid networks, we demonstrated alterations in triacylglycerols and cardiolipins-phosphatidylethanolamines that precede the clinical outcome of end-stage kidney disease by several years. Availability and implementation A complete list of NetGSA results in HTML format can be found at http://metscape.ncibi.org/netgsa/12345-022118/cric_cprobe/022118/results_cric_cprobe/main.html. The DNEA is freely available at https://github.com/wiggie/DNEA. Java wrapper leveraging the cytoscape.js framework is available at http://js.cytoscape.org. Supplementary information Supplementary data are available at Bioinformatics online.
APA, Harvard, Vancouver, ISO, and other styles