


Journal articles on the topic 'Bioinformatics predictions'

Author: Grafiati

Published: 4 June 2021

Last updated: 1 February 2022


Consult the top 50 journal articles for your research on the topic 'Bioinformatics predictions.'


1

Vangaveti, Sweta, Thom Vreven, Yang Zhang, and Zhiping Weng. "Integrating ab initio and template-based algorithms for protein–protein complex structure prediction." Bioinformatics 36, no. 3 (2019): 751–57. http://dx.doi.org/10.1093/bioinformatics/btz623.


Abstract:

Abstract Motivation Template-based and template-free methods have both been widely used in predicting the structures of protein–protein complexes. Template-based modeling is effective when a reliable template is available, while template-free methods are required for predicting the binding modes or interfaces that have not been previously observed. Our goal is to combine the two methods to improve computational protein–protein complex structure prediction. Results Here, we present a method to identify and combine high-confidence predictions of a template-based method (SPRING) with a template-free method (ZDOCK). Cross-validated using the protein–protein docking benchmark version 5.0, our method (ZING) achieved a success rate of 68.2%, outperforming SPRING and ZDOCK, with success rates of 52.1% and 35.9% respectively, when the top 10 predictions were considered per test case. In conclusion, a statistics-based method that evaluates and integrates predictions from template-based and template-free methods is more successful than either method independently. Availability and implementation ZING is available for download as a Github repository (https://github.com/weng-lab/ZING.git). Supplementary information Supplementary data are available at Bioinformatics online.
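The success rates quoted in this abstract are counted per test case over the top 10 ranked predictions. The following is a minimal sketch of how such a top-N success rate can be computed. It is not ZING's code: the function name and the data layout (one ranked list of booleans per test case, `True` marking an acceptable prediction) are assumptions made for illustration, and the toy data are not from the paper.

```python
def top_n_success_rate(ranked_hits, n=10):
    """Fraction of test cases with at least one acceptable prediction in the top n.

    ranked_hits: list of lists of bool, one inner list per test case,
    ordered from best-ranked to worst-ranked prediction (assumed layout).
    """
    if not ranked_hits:
        raise ValueError("need at least one test case")
    # A test case counts as a success if any of its first n predictions is a hit.
    successes = sum(1 for hits in ranked_hits if any(hits[:n]))
    return successes / len(ranked_hits)


# Toy data (hypothetical): three test cases.
cases = [
    [False, True, False],    # acceptable prediction at rank 2
    [False] * 12,            # no acceptable prediction at all
    [False] * 10 + [True],   # acceptable prediction only at rank 11
]
rate_top10 = top_n_success_rate(cases, n=10)
rate_top11 = top_n_success_rate(cases, n=11)
```

With this toy input, only the first case succeeds within the top 10, while widening the cutoff to 11 also admits the third case.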


2

Sommer, I., J. Rahnenfuhrer, F. S. Domingues, U. de Lichtenberg, and T. Lengauer. "Predicting protein structure classes from function predictions." Bioinformatics 20, no. 5 (2004): 770–76. http://dx.doi.org/10.1093/bioinformatics/btg483.


3

Wang, Debby D., Haoran Xie, and Hong Yan. "Proteo-chemometrics interaction fingerprints of protein–ligand complexes predict binding affinity." Bioinformatics 37, no. 17 (2021): 2570–79. http://dx.doi.org/10.1093/bioinformatics/btab132.


Abstract:

Abstract Motivation Reliable predictive models of protein–ligand binding affinity are required in many areas of biomedical research. Accurate prediction based on current descriptors or molecular fingerprints (FPs) remains a challenge. We develop novel interaction FPs (IFPs) to encode protein–ligand interactions and use them to improve the prediction. Results Proteo-chemometrics IFPs (PrtCmm IFPs) formed by combining extended connectivity fingerprints (ECFPs) with the proteo-chemometrics concept. Combining PrtCmm IFPs with machine-learning models led to efficient scoring models, which were validated on the PDBbind v2019 core set and CSAR-HiQ sets. The PrtCmm IFP Score outperformed several other models in predicting protein–ligand binding affinities. Besides, conventional ECFPs were simplified to generate new IFPs, which provided consistent but faster predictions. The relationship between the base atom properties of ECFPs and the accuracy of predictions was also investigated. Availability PrtCmm IFP has been implemented in the IFP Score Toolkit on github (https://github.com/debbydanwang/IFPscore). Supplementary information Supplementary data are available at Bioinformatics online.


4

Magnusson, Rasmus, and Mika Gustafsson. "LiPLike: towards gene regulatory network predictions of high certainty." Bioinformatics 36, no. 8 (2020): 2522–29. http://dx.doi.org/10.1093/bioinformatics/btz950.


Abstract:

Abstract Motivation High correlation in expression between regulatory elements is a persistent obstacle for the reverse-engineering of gene regulatory networks. If two potential regulators have matching expression patterns, it becomes challenging to differentiate between them, thus increasing the risk of false positive identifications. Results To allow for gene regulation predictions of high confidence, we propose a novel method, the Linear Profile Likelihood (LiPLike), that assumes a regression model and iteratively searches for interactions that cannot be replaced by a linear combination of other predictors. To compare the performance of LiPLike with other available inference methods, we benchmarked LiPLike using three independent datasets from the Dialogue on Reverse Engineering Assessment and Methods 5 (DREAM5) network inference challenge. We found that LiPLike could be used to stratify predictions of other inference tools, and when applied to the predictions of DREAM5 participants, we observed an average improvement in accuracy of >140% compared to individual methods. Furthermore, LiPLike was able to independently predict networks better than all DREAM5 participants when applied to biological data. When predicting the Escherichia coli network, LiPLike had an accuracy of 0.38 for the top-ranked 100 interactions, whereas the corresponding DREAM5 consensus model yielded an accuracy of 0.11. Availability and implementation We made LiPLike available to the community as a Python toolbox, available at https://gitlab.com/Gustafsson-lab/liplike. We believe that LiPLike will be used for high confidence predictions in studies where individual model interactions are of high importance, and to remove false positive predictions made by other state-of-the-art gene–gene regulation prediction tools. Supplementary information Supplementary data are available at Bioinformatics online.


5

Liu, Zhi-Ping. "Predicting lncRNA-protein Interactions by Machine Learning Methods: A Review." Current Bioinformatics 15, no. 8 (2021): 831–40. http://dx.doi.org/10.2174/1574893615666200224095925.


Abstract:

In this work, a review of predicting lncRNA-protein interactions by bioinformatics methods is provided with a focus on machine learning. Firstly, a computational framework for predicting lncRNA-protein interactions is presented. Then, the currently available data resources for the predictions have been listed. The existing methods will be reviewed by introducing their crucial steps in the prediction framework. The key functions of lncRNA, e.g., mediator on transcriptional regulation, are often involved in interacting with proteins. The interactions with proteins provide a tunnel of leveraging the molecular cooperativity for fulfilling crucial functions. Thus, the important directions in bioinformatics have been highlighted for identifying essential lncRNA-protein interactions and deciphering the dysfunctional importance of lncRNA, especially in carcinogenesis.


6

Xu, Gang, Qinghua Wang, and Jianpeng Ma. "OPUS-TASS: a protein backbone torsion angles and secondary structure predictor based on ensemble neural networks." Bioinformatics 36, no. 20 (2020): 5021–26. http://dx.doi.org/10.1093/bioinformatics/btaa629.


Abstract:

Abstract Motivation Predictions of protein backbone torsion angles (ϕ and ψ) and secondary structure from sequence are crucial subproblems in protein structure prediction. With the development of deep learning approaches, their accuracies have been significantly improved. To capture the long-range interactions, most studies integrate bidirectional recurrent neural networks into their models. In this study, we introduce and modify a recently proposed architecture named Transformer to capture the interactions between the two residues theoretically with arbitrary distance. Moreover, we take advantage of multitask learning to improve the generalization of neural network by introducing related tasks into the training process. Similar to many previous studies, OPUS-TASS uses an ensemble of models and achieves better results. Results OPUS-TASS uses the same training and validation sets as SPOT-1D. We compare the performance of OPUS-TASS and SPOT-1D on TEST2016 (1213 proteins) and TEST2018 (250 proteins) proposed in the SPOT-1D paper, CASP12 (55 proteins), CASP13 (32 proteins) and CASP-FM (56 proteins) proposed in the SAINT paper, and a recently released PDB structure collection from CAMEO (93 proteins) named as CAMEO93. On these six test sets, OPUS-TASS achieves consistent improvements in both backbone torsion angles prediction and secondary structure prediction. On CAMEO93, SPOT-1D achieves the mean absolute errors of 16.89 and 23.02 for ϕ and ψ predictions, respectively, and the accuracies for 3- and 8-state secondary structure predictions are 87.72 and 77.15%, respectively. In comparison, OPUS-TASS achieves 16.56 and 22.56 for ϕ and ψ predictions, and 89.06 and 78.87% for 3- and 8-state secondary structure predictions, respectively. In particular, after using our torsion angles refinement method OPUS-Refine as the post-processing procedure for OPUS-TASS, the mean absolute errors for final ϕ and ψ predictions are further decreased to 16.28 and 21.98, respectively. Availability and implementation The training and the inference codes of OPUS-TASS and its data are available at https://github.com/thuxugang/opus_tass. Supplementary information Supplementary data are available at Bioinformatics online.
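The abstract reports mean absolute errors for ϕ and ψ. Because torsion angles are periodic, a plain absolute difference would overstate the error for pairs such as 170 and -175. The sketch below shows one common way to handle this, taking the smaller of the two arcs between the angles. It assumes angles in degrees and is not taken from the OPUS-TASS code; the function name and the toy values are illustrative.

```python
def angular_mae(predicted, observed):
    """Mean absolute error between two lists of angles, with wrap-around.

    Angles are assumed to be in degrees. For each pair, the error is the
    smaller of the two arcs separating the angles, so it never exceeds 180.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, o in zip(predicted, observed):
        diff = abs(p - o) % 360.0
        total += min(diff, 360.0 - diff)
    return total / len(predicted)


# Toy values (hypothetical): the first pair straddles the +/-180 boundary.
pred = [170.0, -60.0, 10.0]
obs = [-175.0, -65.0, 20.0]
mae = angular_mae(pred, obs)  # per-pair errors are 15, 5 and 10 degrees
```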


7

López de Frutos, Laura, Jorge J. Cebolla, Pilar Irún, Ralf Köhler, and Pilar Giraldo. "Web-Based Bioinformatics Predictors: Recommendations to Assess Lysosomal Cholesterol Trafficking Diseases-Related Genes." Methods of Information in Medicine 58, no. 01 (2019): 050–59. http://dx.doi.org/10.1055/s-0039-1692463.


Abstract:

Introduction The growing number of genetic variants of unknown significance (VUS) and availability of several in silico prediction tools make the evaluation of potentially deleterious gene variants challenging. Materials and Methods We evaluated several programs and software to determine the one that can predict the impact of genetic variants found in lysosomal storage disorders (LSDs) caused by defects in cholesterol trafficking best. We evaluated the sensitivity, specificity, accuracy, precision, and Matthew's correlation coefficient of the most common software. Results Our findings showed that for exonic variants, only MutPred1 reached 100% accuracy and generated the best predictions (sensitivity and accuracy = 1.00), whereas intronic variants, SROOGLE or Human Splicing Finder (HSF) generated the best predictions (sensitivity = 1.00, and accuracy = 1.00). Discussion Next-generation sequencing substantially increased the number of detected genetic variants, most of which were considered to be VUS, creating a need for accurate pathogenicity prediction. The focus of the present study is the importance of accurately predicting LSDs, with majority of previously unreported specific mutations. Conclusion We found that the best prediction tool for the NPC1, NPC2, and LIPA variants was MutPred1 for exonic regions and HSF and SROOGLE for intronic regions and splice sites.
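The five measures named in this abstract (sensitivity, specificity, accuracy, precision and the Matthews correlation coefficient) all derive from the four cells of a binary confusion matrix. A minimal sketch using the standard definitions follows. The function name, the zero-denominator convention and the example counts are assumptions for illustration and do not come from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""

    def safe_div(num, den):
        # Convention chosen for this sketch: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": safe_div(tp, tp + fn),   # true positive rate
        "specificity": safe_div(tn, tn + fp),   # true negative rate
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "precision": safe_div(tp, tp + fp),
        "mcc": safe_div(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts: 9 pathogenic variants (8 detected), 11 benign (9 cleared).
metrics = binary_metrics(tp=8, fp=2, tn=9, fn=1)
```

A tool with no false positives and no false negatives scores 1.0 on every measure, which is the situation the abstract describes for the best-performing predictors.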


8

Zhang, Tong, Yu Tian, Le Yuan, Fu Chen, Ailin Ren, and Qian-Nan Hu. "Bio2Rxn: sequence-based enzymatic reaction predictions by a consensus strategy." Bioinformatics 36, no. 11 (2020): 3600–3601. http://dx.doi.org/10.1093/bioinformatics/btaa135.


Abstract:

Abstract Summary The development of sequencing technologies has generated large amounts of protein sequence data. The automated prediction of the enzymatic reactions of uncharacterized proteins is a major challenge in the field of bioinformatics. Here, we present Bio2Rxn as a web-based tool to provide putative enzymatic reaction predictions for uncharacterized protein sequences. Bio2Rxn adopts a consensus strategy by incorporating six types of enzyme prediction tools. It allows for the efficient integration of these computational resources to maximize the accuracy and comprehensiveness of enzymatic reaction predictions, which facilitates the characterization of the functional roles of target proteins in metabolism. Bio2Rxn further links the enzyme function prediction with more than 300 000 enzymatic reactions, which were manually curated by more than 100 people over the past 9 years from more than 580 000 publications. Availability and implementation Bio2Rxn is available at: http://design.rxnfinder.org/bio2rxn/. Contact qnhu@sibs.ac.cn Supplementary information Supplementary data are available at Bioinformatics online.
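The abstract says Bio2Rxn combines six enzyme prediction tools through a consensus strategy but does not spell out the rule. One simple form a consensus can take is vote counting, sketched below. This is an assumed illustration, not Bio2Rxn's actual algorithm: the function name, the `min_votes` threshold, the tool names and the labels are all hypothetical.

```python
from collections import Counter


def consensus_predictions(tool_outputs, min_votes=2):
    """Rank labels by how many tools predicted them.

    tool_outputs: dict mapping a tool name to the list of labels it predicted
    (for example EC numbers). Labels with fewer than min_votes are dropped.
    Returns (label, votes) pairs, most-supported first, ties sorted by label.
    """
    votes = Counter()
    for predicted in tool_outputs.values():
        votes.update(set(predicted))  # each tool votes at most once per label
    ranked = [(label, n) for label, n in votes.items() if n >= min_votes]
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical outputs from three placeholder tools.
outputs = {
    "tool_a": ["1.1.1.1", "2.7.1.1"],
    "tool_b": ["1.1.1.1"],
    "tool_c": ["2.7.1.1", "1.1.1.1", "3.4.21.4"],
}
consensus = consensus_predictions(outputs, min_votes=2)
```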


9

Johansson-Åkhe, Isak, Claudio Mirabello, and Björn Wallner. "InterPep2: global peptide–protein docking using interaction surface templates." Bioinformatics 36, no. 8 (2020): 2458–65. http://dx.doi.org/10.1093/bioinformatics/btaa005.


Abstract:

Abstract Motivation Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction-methods exist, most are limited in performance or availability. Results InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ-score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and templates sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among top10, similar to another state-of-the-art template-based method which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on-par with leading methods. The extended InterPep2-Refined protocol managed to correctly model 15 of these complexes within 4.0 Å LRMSD among top10, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18). Availability and implementation The program is available from: http://wallnerlab.org/InterPep2. Supplementary information Supplementary data are available at Bioinformatics online.


10

Weis, Caroline, Max Horn, Bastian Rieck, Aline Cuénod, Adrian Egli, and Karsten Borgwardt. "Topological and kernel-based microbial phenotype prediction from MALDI-TOF mass spectra." Bioinformatics 36, Supplement_1 (2020): i30–i38. http://dx.doi.org/10.1093/bioinformatics/btaa429.


Abstract:

Abstract Motivation Microbial species identification based on matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry (MS) has become a standard tool in clinical microbiology. The resulting MALDI-TOF mass spectra also harbour the potential to deliver prediction results for other phenotypes, such as antibiotic resistance. However, the development of machine learning algorithms specifically tailored to MALDI-TOF MS-based phenotype prediction is still in its infancy. Moreover, current spectral pre-processing typically involves a parameter-heavy chain of operations without analyzing their influence on the prediction results. In addition, classification algorithms lack quantification of uncertainty, which is indispensable for predictions potentially influencing patient treatment. Results We present a novel prediction method for antimicrobial resistance based on MALDI-TOF mass spectra. First, we compare the complex conventional pre-processing to a new approach that exploits topological information and requires only a single parameter, namely the number of peaks of a spectrum to keep. Second, we introduce PIKE, the peak information kernel, a similarity measure specifically tailored to MALDI-TOF mass spectra which, combined with a Gaussian process classifier, provides well-calibrated uncertainty estimates about predictions. We demonstrate the utility of our approach by predicting antibiotic resistance of three clinically highly relevant bacterial species. Our method consistently outperforms competitor approaches, while demonstrating improved performance and security by rejecting out-of-distribution samples, such as bacterial species that are not represented in the training data. Ultimately, our method could contribute to an earlier and precise antimicrobial treatment in clinical patient care. Availability and implementation We make our code publicly available as an easy-to-use Python package under https://github.com/BorgwardtLab/maldi_PIKE.


11

Craveiro Sarmento, Aquiles Sales, Lázaro Batista de Azevedo Medeiros, Lucymara Fassarella Agnez-Lima, Josivan Gomes Lima, and Julliane Tamara Araújo de Melo Campos. "Exploring Seipin: From Biochemistry to Bioinformatics Predictions." International Journal of Cell Biology 2018 (September 19, 2018): 1–21. http://dx.doi.org/10.1155/2018/5207608.


Abstract:

Seipin is a nonenzymatic protein encoded by the BSCL2 gene. It is involved in lipodystrophy and seipinopathy diseases. Named in 2001, all seipin functions are still far from being understood. Therefore, we reviewed much of the research, trying to find a pattern that could explain commonly observed features of seipin expression disorders. Likewise, this review shows how this protein seems to have tissue-specific functions. In an integrative view, we conclude by proposing a theoretical model to explain how seipin might be involved in the triacylglycerol synthesis pathway.


12

Li, Wenbo, Weijun Guan, and Mansheng Li. "Bioinformatics Predictions of Fas Related Protein Network." Journal of Computational and Theoretical Nanoscience 13, no. 1 (2016): 830–35. http://dx.doi.org/10.1166/jctn.2016.4881.


13

Bradley, Thomas, and Simon Moxon. "FilTar: using RNA-Seq data to improve microRNA target prediction accuracy in animals." Bioinformatics 36, no. 8 (2020): 2410–16. http://dx.doi.org/10.1093/bioinformatics/btaa007.


Abstract:

Abstract Motivation MicroRNA (miRNA) target prediction algorithms do not generally consider biological context and therefore generic target prediction based on seed binding can lead to a high level of false-positive predictions. Here, we present FilTar, a method that incorporates RNA-Seq data to make miRNA target prediction specific to a given cell type or tissue of interest. Results We demonstrate that FilTar can be used to: (i) provide sample specific 3′-UTR reannotation; extending or truncating default annotations based on RNA-Seq read evidence and (ii) filter putative miRNA target predictions by transcript expression level, thus removing putative interactions where the target transcript is not expressed in the tissue or cell line of interest. We test the method on a variety of miRNA transfection datasets and demonstrate increased accuracy versus generic miRNA target prediction methods. Availability and implementation FilTar is freely available and can be downloaded from https://github.com/TBradley27/FilTar. The tool is implemented using the Python and R programming languages, and is supported on GNU/Linux operating systems. Supplementary information Supplementary data are available at Bioinformatics online.
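Step (ii) in this abstract, removing putative interactions whose target transcript is not expressed in the sample of interest, amounts to a threshold filter. The sketch below illustrates the idea only. It is not FilTar's implementation: the function name, the TPM unit, the `min_tpm` cutoff and the identifiers are assumptions.

```python
def filter_targets_by_expression(predictions, expression_tpm, min_tpm=0.1):
    """Keep only miRNA-target pairs whose target transcript is expressed.

    predictions: list of (mirna_id, transcript_id) pairs from a generic predictor.
    expression_tpm: dict mapping transcript_id to its expression in the sample.
    Transcripts missing from expression_tpm are treated as not expressed.
    """
    return [
        (mirna, transcript)
        for mirna, transcript in predictions
        if expression_tpm.get(transcript, 0.0) >= min_tpm
    ]


# Hypothetical identifiers and expression values.
predicted_pairs = [
    ("miR-1", "tx_A"),
    ("miR-1", "tx_B"),
    ("miR-21", "tx_C"),
]
sample_expression = {"tx_A": 12.5, "tx_B": 0.0}  # tx_C was not quantified
kept = filter_targets_by_expression(predicted_pairs, sample_expression)
```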


14

Mi, Zhibao, Kui Shen, Nan Song, et al. "Module-based prediction approach for robust inter-study predictions in microarray data." Bioinformatics 26, no. 20 (2010): 2586–93. http://dx.doi.org/10.1093/bioinformatics/btq472.


15

Michel, Mirco, David Menéndez Hurtado, Karolis Uziela, and Arne Elofsson. "Large-scale structure prediction by improved contact predictions and model quality assessment." Bioinformatics 33, no. 14 (2017): i23–i29. http://dx.doi.org/10.1093/bioinformatics/btx239.


16

Wade, Bruce A., Krishnendu Ghosh, and Peter J. Tonellato. "Optimization of a Gene Analysis Application." Computing Letters 2, no. 1-2 (2006): 81–88. http://dx.doi.org/10.1163/157404006777491927.


Abstract:

MetaGene is a software environment for gene analysis developed at the Bioinformatics Research Center, Medical College of Wisconsin. In this work, a new neural network optimization module is developed to enhance the prediction of gene features developed by MetaGene. The input of the neural network consists of gene feature predictions from several gene analysis engines used by MetaGene. When compared, these predictions are often in conflict. The output from the neural net is a synthesis of these individual predictions taking into account the degree of conflict detected. This optimized prediction provides a more accurate answer when compared to the default prediction of MetaGene or any single prediction engine’s solution.


17

Zhang, Jian, and Lukasz Kurgan. "SCRIBER: accurate and partner type-specific prediction of protein-binding residues from proteins sequences." Bioinformatics 35, no. 14 (2019): i343–i353. http://dx.doi.org/10.1093/bioinformatics/btz324.


Abstract:

Abstract Motivation Accurate predictions of protein-binding residues (PBRs) enhances understanding of molecular-level rules governing protein–protein interactions, helps protein–protein docking and facilitates annotation of protein functions. Recent studies show that current sequence-based predictors of PBRs severely cross-predict residues that interact with other types of protein partners (e.g. RNA and DNA) as PBRs. Moreover, these methods are relatively slow, prohibiting genome-scale use. Results We propose a novel, accurate and fast sequence-based predictor of PBRs that minimizes the cross-predictions. Our SCRIBER (SeleCtive pRoteIn-Binding rEsidue pRedictor) method takes advantage of three innovations: comprehensive dataset that covers multiple types of binding residues, novel types of inputs that are relevant to the prediction of PBRs, and an architecture that is tailored to reduce the cross-predictions. The dataset includes complete protein chains and offers improved coverage of binding annotations that are transferred from multiple protein–protein complexes. We utilize innovative two-layer architecture where the first layer generates a prediction of protein-binding, RNA-binding, DNA-binding and small ligand-binding residues. The second layer re-predicts PBRs by reducing overlap between PBRs and the other types of binding residues produced in the first layer. Empirical tests on an independent test dataset reveal that SCRIBER significantly outperforms current predictors and that all three innovations contribute to its high predictive performance. SCRIBER reduces cross-predictions by between 41% and 69% and our conservative estimates show that it is at least 3 times faster. We provide putative PBRs produced by SCRIBER for the entire human proteome and use these results to hypothesize that about 14% of currently known human protein domains bind proteins. Availability and implementation SCRIBER webserver is available at http://biomine.cs.vcu.edu/servers/SCRIBER/. Supplementary information Supplementary data are available at Bioinformatics online.


18

Kuijjer, Marieke L., Maud Fagny, Alessandro Marin, John Quackenbush, and Kimberly Glass. "PUMA: PANDA Using MicroRNA Associations." Bioinformatics 36, no. 18 (2020): 4765–73. http://dx.doi.org/10.1093/bioinformatics/btaa571.


Abstract:

Abstract Motivation Conventional methods to analyze genomic data do not make use of the interplay between multiple factors, such as between microRNAs (miRNAs) and the messenger RNA (mRNA) transcripts they regulate, and thereby often fail to identify the cellular processes that are unique to specific tissues. We developed PUMA (PANDA Using MicroRNA Associations), a computational tool that uses message passing to integrate a prior network of miRNA target predictions with target gene co-expression information to model genome-wide gene regulation by miRNAs. We applied PUMA to 38 tissues from the Genotype-Tissue Expression project, integrating RNA-Seq data with two different miRNA target predictions priors, built on predictions from TargetScan and miRanda, respectively. We found that while target predictions obtained from these two different resources are considerably different, PUMA captures similar tissue-specific miRNA–target regulatory interactions in the different network models. Furthermore, the tissue-specific functions of miRNAs we identified based on regulatory profiles (available at: https://kuijjer.shinyapps.io/puma_gtex/) are highly similar between networks modeled on the two target prediction resources. This indicates that PUMA consistently captures important tissue-specific miRNA regulatory processes. In addition, using PUMA we identified miRNAs regulating important tissue-specific processes that, when mutated, may result in disease development in the same tissue. Availability and implementation PUMA is available in C++, MATLAB and Python on GitHub (https://github.com/kuijjerlab and https://netzoo.github.io/). Supplementary information Supplementary data are available at Bioinformatics online.


19

Li, Wenbo, Weijun Guan, and Mansheng Li. "Bioinformatics Predictions of Caspase-8 Related Protein Network." Journal of Computational and Theoretical Nanoscience 13, no. 3 (2016): 1635–39. http://dx.doi.org/10.1166/jctn.2016.5092.


20

Charlier, Jeremy, Robert Nadon, and Vladimir Makarenkov. "Accurate deep learning off-target prediction with novel sgRNA-DNA sequence encoding in CRISPR-Cas9 gene editing." Bioinformatics 37, no. 16 (2021): 2299–307. http://dx.doi.org/10.1093/bioinformatics/btab112.


Abstract:

Abstract Motivation Off-target predictions are crucial in gene editing research. Recently, significant progress has been made in the field of prediction of off-target mutations, particularly with CRISPR-Cas9 data, thanks to the use of deep learning. CRISPR-Cas9 is a gene editing technique which allows manipulation of DNA fragments. The sgRNA-DNA (single guide RNA-DNA) sequence encoding for deep neural networks, however, has a strong impact on the prediction accuracy. We propose a novel encoding of sgRNA-DNA sequences that aggregates sequence data with no loss of information. Results In our experiments, we compare the proposed sgRNA-DNA sequence encoding applied in a deep learning prediction framework with state-of-the-art encoding and prediction methods. We demonstrate the superior accuracy of our approach in a simulation study involving Feedforward Neural Networks (FNNs), Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) as well as the traditional Random Forest (RF), Naive Bayes (NB) and Logistic Regression (LR) classifiers. We highlight the quality of our results by building several FNNs, CNNs and RNNs with various layer depths and performing predictions on two popular gene editing datasets (CRISPOR and GUIDE-seq). In all our experiments, the new encoding led to more accurate off-target prediction results, providing an improvement of the area under the Receiver Operating Characteristic (ROC) curve up to 35%. Availability and implementation The code and data used in this study are available at: https://github.com/dagrate/dl-offtarget. Supplementary information Supplementary data are available at Bioinformatics online.
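The comparison in this abstract is expressed as the area under the ROC curve. For readers unfamiliar with the metric, the sketch below computes it through its rank interpretation: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. It is a generic illustration using made-up labels and scores, not code from the dl-offtarget repository.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: list of 0/1 ground-truth values (1 = off-target site, by assumption).
    scores: list of predicted scores, higher meaning more likely positive.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Made-up example: two positives, three negatives.
y_true = [1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.3, 0.1]
auc = roc_auc(y_true, y_score)  # 5 of the 6 positive-negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a sketch but not for the dataset sizes typical of off-target studies.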


21

Dai, Bowen, and Chris Bailey-Kellogg. "Protein interaction interface region prediction by geometric deep learning." Bioinformatics 37, no. 17 (2021): 2580–88. http://dx.doi.org/10.1093/bioinformatics/btab154.


Abstract:

Abstract Motivation Protein–protein interactions drive wide-ranging molecular processes, and characterizing at the atomic level how proteins interact (beyond just the fact that they interact) can provide key insights into understanding and controlling this machinery. Unfortunately, experimental determination of three-dimensional protein complex structures remains difficult and does not scale to the increasingly large sets of proteins whose interactions are of interest. Computational methods are thus required to meet the demands of large-scale, high-throughput prediction of how proteins interact, but unfortunately, both physical modeling and machine learning methods suffer from poor precision and/or recall. Results In order to improve performance in predicting protein interaction interfaces, we leverage the best properties of both data- and physics-driven methods to develop a unified Geometric Deep Neural Network, ‘PInet’ (Protein Interface Network). PInet consumes pairs of point clouds encoding the structures of two partner proteins, in order to predict their structural regions mediating interaction. To make such predictions, PInet learns and utilizes models capturing both geometrical and physicochemical molecular surface complementarity. In application to a set of benchmarks, PInet simultaneously predicts the interface regions on both interacting proteins, achieving performance equivalent to or even much better than the state-of-the-art predictor for each dataset. Furthermore, since PInet is based on joint segmentation of a representation of a protein surfaces, its predictions are meaningful in terms of the underlying physical complementarity driving molecular recognition. Availability and implementation PInet scripts and models are available at https://github.com/FTD007/PInet. Supplementary information Supplementary data are available at Bioinformatics online.


22

Kuriata, Aleksander, Valentin Iglesias, Mateusz Kurcinski, Salvador Ventura, and Sebastian Kmiecik. "Aggrescan3D standalone package for structure-based prediction of protein aggregation properties." Bioinformatics 35, no. 19 (2019): 3834–35. http://dx.doi.org/10.1093/bioinformatics/btz143.


Abstract:

Abstract Summary Aggrescan3D (A3D) standalone is a multiplatform Python package for structure-based prediction of protein aggregation properties and rational design of protein solubility. A3D allows the re-design of protein solubility by combining structural aggregation propensity and stability predictions, as demonstrated by a recent experimental study. It also enables predicting the impact of protein conformational fluctuations on the aggregation properties. The standalone A3D version is an upgrade of the original web server implementation—it introduces a number of customizable options, automated analysis of multiple mutations and offers a flexible computational framework for merging it with other computational tools. Availability and implementation A3D standalone is distributed under the MIT license, which is free for academic and non-profit users. It is implemented in Python. The A3D standalone source code, wiki with documentation and examples of use, and installation instructions for Linux, macOS and Windows are available in the A3D standalone repository at https://bitbucket.org/lcbio/aggrescan3d.

23

Kumar, Narender, Kathy E. Raven, Beth Blane, et al. "Evaluation of a fully automated bioinformatics tool to predict antibiotic resistance from MRSA genomes." Journal of Antimicrobial Chemotherapy 75, no. 5 (2020): 1117–22. http://dx.doi.org/10.1093/jac/dkz570.

Abstract:

Abstract Objectives The genetic prediction of phenotypic antibiotic resistance based on analysis of WGS data is becoming increasingly feasible, but a major barrier to its introduction into routine use is the lack of fully automated interpretation tools. Here, we report the findings of a large evaluation of the Next Gen Diagnostics (NGD) automated bioinformatics analysis tool to predict the phenotypic resistance of MRSA. Methods MRSA-positive patients were identified in a clinical microbiology laboratory in England between January and November 2018. One MRSA isolate per patient together with all blood culture isolates (total n = 778) were sequenced on the Illumina MiniSeq instrument in batches of 21 clinical MRSA isolates and three controls. Results The NGD system activated post-sequencing and processed the sequences to determine susceptible/resistant predictions for 11 antibiotics, taking around 11 minutes to analyse 24 isolates sequenced on a single sequencing run. NGD results were compared with phenotypic susceptibility testing performed by the clinical laboratory using the disc diffusion method and EUCAST breakpoints. Following retesting of discrepant results, concordance between phenotypic results and NGD genetic predictions was 99.69%. Further investigation of 22 isolate genomes associated with persistent discrepancies revealed a range of reasons in 12 cases, but no cause could be found for the remainder. Genetic predictions generated by the NGD tool were compared with predictions generated by an independent research-based informatics approach, which demonstrated an overall concordance between the two methods of 99.97%. Conclusions We conclude that the NGD system provides rapid and accurate prediction of the antibiotic susceptibility of MRSA.

24

Monger, Steven, Michael Troup, Eddie Ip, Sally L. Dunwoodie, and Eleni Giannoulatou. "Spliceogen: an integrative, scalable tool for the discovery of splice-altering variants." Bioinformatics 35, no. 21 (2019): 4405–7. http://dx.doi.org/10.1093/bioinformatics/btz263.

Abstract:

Abstract Motivation In silico prediction tools are essential for identifying variants which create or disrupt cis-splicing motifs. However, there are limited options for genome-scale discovery of splice-altering variants. Results We have developed Spliceogen, a highly scalable pipeline integrating predictions from some of the individually best performing models for splice motif prediction: MaxEntScan, GeneSplicer, ESRseq and Branchpointer. Availability and implementation Spliceogen is available as a command line tool which accepts VCF/BED inputs and handles both single nucleotide variants (SNVs) and indels (https://github.com/VCCRI/Spliceogen). SNV databases with prediction scores are also available, covering all possible SNVs at all genomic positions within all Gencode-annotated multi-exon transcripts. Supplementary information Supplementary data are available at Bioinformatics online.

25

Bertelli, Claire, and Fiona S. L. Brinkman. "Improved genomic island predictions with IslandPath-DIMOB." Bioinformatics 34, no. 13 (2018): 2161–67. http://dx.doi.org/10.1093/bioinformatics/bty095.

26

Loher, Phillipe, and Isidore Rigoutsos. "Interactive exploration of RNA22 microRNA target predictions." Bioinformatics 28, no. 24 (2012): 3322–23. http://dx.doi.org/10.1093/bioinformatics/bts615.

27

Larsson, P., M. J. Skwark, B. Wallner, and A. Elofsson. "Improved predictions by Pcons.net using multiple templates." Bioinformatics 27, no. 3 (2010): 426–27. http://dx.doi.org/10.1093/bioinformatics/btq664.

28

Pavlovic, V., A. Garg, and S. Kasif. "A Bayesian framework for combining gene predictions." Bioinformatics 18, no. 1 (2002): 19–27. http://dx.doi.org/10.1093/bioinformatics/18.1.19.

29

Michel, Mirco, Sikander Hayat, Marcin J. Skwark, Chris Sander, Debora S. Marks, and Arne Elofsson. "PconsFold: improved contact predictions improve protein models." Bioinformatics 30, no. 17 (2014): i482–i488. http://dx.doi.org/10.1093/bioinformatics/btu458.

30

Simkovic, Felix, Jens M. H. Thomas, and Daniel J. Rigden. "ConKit: a python interface to contact predictions." Bioinformatics 33, no. 14 (2017): 2209–11. http://dx.doi.org/10.1093/bioinformatics/btx148.

31

Schneider, Melanie, Jean-Luc Pons, William Bourguet, and Gilles Labesse. "Towards accurate high-throughput ligand affinity prediction by exploiting structural ensembles, docking metrics and ligand similarity." Bioinformatics 36, no. 1 (2019): 160–68. http://dx.doi.org/10.1093/bioinformatics/btz538.

Abstract:

Abstract Motivation Nowadays, virtual screening (VS) plays a major role in the process of drug development. Nonetheless, an accurate estimation of binding affinities, which is crucial at all stages, is not trivial and may require target-specific fine-tuning. Furthermore, drug design also requires improved predictions for putative secondary targets, among which is Estrogen Receptor alpha (ERα). Results VS based on combinations of Structure-Based VS (SBVS) and Ligand-Based VS (LBVS) is gaining momentum to improve VS performance. In this study, we propose an integrated approach using ligand docking on multiple structural ensembles to reflect receptor flexibility. Then, we investigate the impact of the two different types of features (structure-based and ligand molecular descriptors) on affinity predictions using a random forest algorithm. We find that ligand-based features have lower predictive power (rP = 0.69, R² = 0.47) than structure-based features (rP = 0.78, R² = 0.60). Their combination maintains high accuracy (rP = 0.73, R² = 0.50) on the internal test set, but it shows superior robustness on external datasets. Further improvement, together with extending the training dataset to include xenobiotics, leads to a novel high-throughput affinity prediction method for ERα ligands (rP = 0.85, R² = 0.71). The presented prediction tool is provided to the community as a dedicated satellite of the @TOME server, to which one can upload a ligand dataset in mol2 format and obtain docked ligands and predicted affinities. Availability and implementation http://edmon.cbs.cnrs.fr. Supplementary information Supplementary data are available at Bioinformatics online.
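The two accuracy figures quoted in this abstract can be illustrated with a minimal sketch. It assumes that rP is the Pearson correlation between measured and predicted affinities and that the second figure is the coefficient of determination; the abstract does not define either, and the function names and sample values below are hypothetical.

```python
import math

def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_m * var_p)

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical affinities (e.g. pKd values), for illustration only.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
print(pearson_r(measured, predicted), r_squared(measured, predicted))
```

The two measures answer different questions: the correlation rewards predictions that rank and scale with the measurements, while the coefficient of determination also penalizes systematic offsets.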

32

Shah, N. H. "Translational Bioinformatics Embraces Big Data." Yearbook of Medical Informatics 21, no. 01 (2012): 130–34. http://dx.doi.org/10.1055/s-0038-1639443.

Abstract:

Summary We review the latest trends and major developments in translational bioinformatics in the year 2011-2012. Our emphasis is on highlighting the key events in the field and pointing at promising research areas for the future. The key take-home points are:
• Translational informatics is ready to revolutionize human health and healthcare using large-scale measurements on individuals.
• Data-centric approaches that compute on massive amounts of data (often called “Big Data”) to discover patterns and to make clinically relevant predictions will gain adoption.
• Research that bridges the latest multimodal measurement technologies with large amounts of electronic healthcare data is increasing, and is where new breakthroughs will occur.

33

Barot, Meet, Vladimir Gligorijević, Kyunghyun Cho, and Richard Bonneau. "NetQuilt: deep multispecies network-based protein function prediction using homology-informed network similarity." Bioinformatics 37, no. 16 (2021): 2414–22. http://dx.doi.org/10.1093/bioinformatics/btab098.

Abstract:

Abstract Motivation Transferring knowledge between species is challenging: different species contain distinct proteomes and cellular architectures, which cause their proteins to carry out different functions via different interaction networks. Many approaches to protein functional annotation use sequence similarity to transfer knowledge between species. These approaches cannot produce accurate predictions for proteins without homologues of known function, as many functions require cellular context for meaningful prediction. To supply this context, network-based methods use protein-protein interaction (PPI) networks as a source of information for inferring protein function and have demonstrated promising results in function prediction. However, most of these methods are tied to a network for a single species, and many species lack biological networks. Results In this work, we integrate sequence and network information across multiple species by computing IsoRank similarity scores to create a meta-network profile of the proteins of multiple species. We use this integrated multispecies meta-network as input to train a maxout neural network with Gene Ontology terms as target labels. Our multispecies approach takes advantage of more training examples, and consequently leads to significant improvements in function prediction performance compared to two network-based methods, a deep learning sequence-based method and the BLAST annotation method used in the Critical Assessment of Functional Annotation. We are able to demonstrate that our approach performs well even in cases where a species has no network information available: when an organism’s PPI network is left out we can use our multi-species method to make predictions for the left-out organism with good performance. Availability and implementation The code is freely available at https://github.com/nowittynamesleft/NetQuilt. The data, including sequences, PPI networks and GO annotations are available at https://string-db.org/. Supplementary information Supplementary data are available at Bioinformatics online.

34

Lundegaard, Claus, Ole Lund, Can Keşmir, Søren Brunak, and Morten Nielsen. "Modeling the adaptive immune system: predictions and simulations." Bioinformatics 23, no. 24 (2007): 3265–75. http://dx.doi.org/10.1093/bioinformatics/btm471.

35

Murakami, Y., and S. Jones. "SHARP2: protein-protein interaction predictions using patch analysis." Bioinformatics 22, no. 14 (2006): 1794–95. http://dx.doi.org/10.1093/bioinformatics/btl171.

36

Tolstorukov, M. Y., V. Choudhary, W. K. Olson, V. B. Zhurkin, and P. J. Park. "nuScore: a web-interface for nucleosome positioning predictions." Bioinformatics 24, no. 12 (2008): 1456–58. http://dx.doi.org/10.1093/bioinformatics/btn212.

37

Michel, Mirco, David Menéndez Hurtado, and Arne Elofsson. "PconsC4: fast, accurate and hassle-free contact predictions." Bioinformatics 35, no. 15 (2018): 2677–79. http://dx.doi.org/10.1093/bioinformatics/bty1036.

Abstract:

Abstract Motivation Residue contact prediction was revolutionized recently by the introduction of direct coupling analysis (DCA). Further improvements, in particular for small families, have been obtained by the combination of DCA and deep learning methods. However, existing deep learning contact prediction methods often rely on a number of external programs and are therefore computationally expensive. Results Here, we introduce a novel contact predictor, PconsC4, which performs on par with state-of-the-art methods. PconsC4 is heavily optimized, does not use any external programs and therefore is significantly faster and easier to use than other methods. Availability and implementation PconsC4 is freely available under the GPL license from https://github.com/ElofssonLab/PconsC4. Installation is easy using the pip command and works on any system with Python 3.5 or later and a GCC compiler. It requires neither a GPU nor special hardware. Supplementary information Supplementary data are available at Bioinformatics online.

38

Warwick Vesztrocy, Alex, and Christophe Dessimoz. "Benchmarking gene ontology function predictions using negative annotations." Bioinformatics 36, Supplement_1 (2020): i210–i218. http://dx.doi.org/10.1093/bioinformatics/btaa466.

Abstract:

Abstract Motivation With the ever-increasing number and diversity of sequenced species, the challenge to characterize genes with functional information is even more important. In most species, this characterization almost entirely relies on automated electronic methods. As such, it is critical to benchmark the various methods. The Critical Assessment of protein Function Annotation algorithms (CAFA) series of community experiments provide the most comprehensive benchmark, with a time-delayed analysis leveraging newly curated experimentally supported annotations. However, the definition of a false positive in CAFA has not fully accounted for the open world assumption (OWA), leading to a systematic underestimation of precision. The main reason for this limitation is the relative paucity of negative experimental annotations. Results This article introduces a new, OWA-compliant, benchmark based on a balanced test set of positive and negative annotations. The negative annotations are derived from expert-curated annotations of protein families on phylogenetic trees. This approach results in a large increase in the average information content of negative annotations. The benchmark has been tested using the naïve and BLAST baseline methods, as well as two orthology-based methods. This new benchmark could complement existing ones in future CAFA experiments. Availability and Implementation All data, as well as code used for analysis, is available from https://lab.dessimoz.org/20_not. Supplementary information Supplementary data are available at Bioinformatics online.

39

Drăgan, Monica-Andreea, Ismail Moghul, Anurag Priyam, Claudio Bustos, and Yannick Wurm. "GeneValidator: identify problems with protein-coding gene predictions." Bioinformatics 32, no. 10 (2016): 1559–61. http://dx.doi.org/10.1093/bioinformatics/btw015.

40

Zhou, Mengshi, Chunlei Zheng, and Rong Xu. "Combining phenome-driven drug-target interaction prediction with patients’ electronic health records-based clinical corroboration toward drug discovery." Bioinformatics 36, Supplement_1 (2020): i436–i444. http://dx.doi.org/10.1093/bioinformatics/btaa451.

Abstract:

Abstract Motivation Predicting drug–target interactions (DTIs) using human phenotypic data has the potential to eliminate the translational gap between animal experiments and clinical outcomes in humans. One challenge in human phenome-driven DTI predictions is integrating and modeling diverse drug and disease phenotypic relationships. Leveraging large amounts of clinically observed phenotypes of drugs and diseases and electronic health records (EHRs) of 72 million patients, we developed a novel integrated computational drug discovery approach by seamlessly combining DTI prediction and clinical corroboration. Results We developed a network-based DTI prediction system (TargetPredict) by modeling 855 904 phenotypic and genetic relationships among 1430 drugs, 4251 side effects, 1059 diseases and 17 860 genes. We systematically evaluated TargetPredict in de novo cross-validation and compared it to a state-of-the-art phenome-driven DTI prediction approach. We applied TargetPredict in identifying novel repositioned candidate drugs for Alzheimer’s disease (AD), a disease affecting over 5.8 million people in the United States. We evaluated the clinical efficiency of top repositioned drug candidates using EHRs of over 72 million patients. The area under the receiver operating characteristic (ROC) curve was 0.97 in the de novo cross-validation when evaluated using 910 drugs. TargetPredict outperformed a state-of-the-art phenome-driven DTI prediction system as measured by precision–recall curves [measured by average precision (MAP): 0.28 versus 0.23, P-value < 0.0001]. The EHR-based case–control studies identified that prescriptions of top-ranked repositioned drugs are significantly associated with lower odds of AD diagnosis. For example, we showed that the prescription of liraglutide, a type 2 diabetes drug, is significantly associated with decreased risk of AD diagnosis [adjusted odds ratios (AORs): 0.76; 95% confidence intervals (CI) (0.70, 0.82), P-value < 0.0001]. In summary, our integrated approach that seamlessly combines computational DTI prediction and large-scale patients’ EHRs-based clinical corroboration has high potential for rapidly identifying novel drug targets and drug candidates for complex diseases. Availability and implementation nlp.case.edu/public/data/TargetPredict.
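To make the odds-ratio figures in this abstract easier to read, here is a minimal sketch of an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 case–control table. This is not the authors' analysis: the abstract reports adjusted odds ratios, which require a regression model with covariates, and the function name and counts below are hypothetical.

```python
import math

def odds_ratio_ci(exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval (z=1.96 gives ~95%)."""
    a, b = exposed_cases, exposed_controls
    c, d = unexposed_cases, unexposed_controls
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: drug users vs. non-users, with and without a diagnosis.
or_, lower, upper = odds_ratio_ci(20, 80, 40, 60)
print(f"OR = {or_:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

An interval that lies entirely below 1, as in the liraglutide example quoted above, indicates lower odds of diagnosis among the exposed group.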

41

Ryu, Jae Yong, Mi Young Lee, Jeong Hyun Lee, Byung Ho Lee, and Kwang-Seok Oh. "DeepHIT: a deep learning framework for prediction of hERG-induced cardiotoxicity." Bioinformatics 36, no. 10 (2020): 3049–55. http://dx.doi.org/10.1093/bioinformatics/btaa075.

Abstract:

Abstract Motivation Blockade of the human ether-à-go-go-related gene (hERG) channel by small compounds causes a prolonged QT interval that can lead to severe cardiotoxicity and is a major cause of the many failures in drug development. Thus, evaluating the hERG-blocking activity of small compounds is important for successful drug development. To this end, various computational prediction tools have been developed, but their prediction performances in terms of sensitivity and negative predictive value (NPV) need to be improved to reduce false negative predictions. Results We propose a computational framework, DeepHIT, which predicts hERG blockers and non-blockers for input compounds. For the development of DeepHIT, we generated a large-scale gold-standard dataset, which includes 6632 hERG blockers and 7808 hERG non-blockers. DeepHIT is designed to contain three deep learning models to improve sensitivity and NPV, which, in turn, produce fewer false negative predictions. DeepHIT outperforms currently available tools in terms of accuracy (0.773), MCC (0.476), sensitivity (0.833) and NPV (0.643) on an external test dataset. We also developed an in silico chemical transformation module that generates virtual compounds from a seed compound, based on the known chemical transformation patterns. As a proof-of-concept study, we identified novel urotensin II receptor (UT) antagonists without hERG-blocking activity derived from a seed compound of a previously reported UT antagonist (KR-36676) with a strong hERG-blocking activity. In summary, DeepHIT will serve as a useful tool to predict hERG-induced cardiotoxicity of small compounds in the early stages of drug discovery and development. Availability and implementation https://bitbucket.org/krictai/deephit and https://bitbucket.org/krictai/chemtrans Supplementary information Supplementary data are available at Bioinformatics online.
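The four metrics named in this abstract all derive from a binary confusion matrix. The sketch below uses the standard textbook definitions, treating hERG blockers as the positive class (an assumption); the function name and counts are hypothetical and are not taken from the DeepHIT evaluation.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, NPV and MCC from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of true blockers that are detected
    npv = tn / (tn + fn)           # fraction of predicted non-blockers that are correct
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "npv": npv, "mcc": mcc}

# Hypothetical counts for illustration.
print(classification_metrics(tp=40, fp=10, tn=30, fn=20))
```

Both sensitivity and NPV have the false-negative count in their denominator, which is why improving them corresponds to missing fewer true blockers.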

42

Ke, Yaobin, Jiahua Rao, Huiying Zhao, Yutong Lu, Nong Xiao, and Yuedong Yang. "Accurate prediction of genome-wide RNA secondary structure profile based on extreme gradient boosting." Bioinformatics 36, no. 17 (2020): 4576–82. http://dx.doi.org/10.1093/bioinformatics/btaa534.

Abstract:

Abstract Motivation RNA secondary structure plays a vital role in fundamental cellular processes, and identification of RNA secondary structure is a key step to understand RNA functions. Recently, a few experimental methods were developed to profile genome-wide RNA secondary structure, i.e. the pairing probability of each nucleotide, through high-throughput sequencing techniques. However, these high-throughput methods have low precision and cannot cover all nucleotides due to limited sequencing coverage. Results Here, we have developed a new method for the prediction of genome-wide RNA secondary structure profile from RNA sequence based on the extreme gradient boosting technique. The method achieves predictions with areas under the receiver operating characteristic curve (AUC) >0.9 on three different datasets, and AUC of 0.888 by another independent test on the recently released Zika virus data. These AUCs are consistently >5% greater than those by the CROSS method recently developed based on a shallow neural network. Further analysis on the 1000 Genomes Project data showed that our predicted unpaired probabilities are highly correlated (>0.8) with the minor allele frequencies at synonymous, non-synonymous mutations, and mutations in untranslated regions, which were higher than those generated by RNAplfold. Moreover, the prediction over all human mRNA indicated a result consistent with the previous observation that there is a periodic distribution of unpaired probability on codons. The accurate predictions by our method indicate that such a model trained on genome-wide experimental data might be an alternative for analytical methods. Availability and implementation The GRASP is available for academic use at https://github.com/sysu-yanglab/GRASP. Supplementary information Supplementary data are available online.
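The AUC values quoted in this abstract can be read as the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. The sketch below computes that quantity by direct pairwise comparison. It assumes binary labels (for instance 1 for unpaired and 0 for paired nucleotides) and per-nucleotide scores; the function name and data are hypothetical and are not drawn from GRASP.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical labels and predicted probabilities.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of examples, so it suits small illustrations only; genome-scale evaluations would normally use a rank-based implementation from an established library.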

43

Chen, Muhao, Chelsea J. T. Ju, Guangyu Zhou, et al. "Multifaceted protein–protein interaction prediction based on Siamese residual RCNN." Bioinformatics 35, no. 14 (2019): i305–i314. http://dx.doi.org/10.1093/bioinformatics/btz328.

Abstract:

Abstract Motivation Sequence-based protein–protein interaction (PPI) prediction represents a fundamental computational biology problem. To address this problem, extensive research efforts have been made to extract predefined features from the sequences. Based on these features, statistical algorithms are learned to classify the PPIs. However, such explicit features are usually costly to extract, and typically have limited coverage on the PPI information. Results We present an end-to-end framework, PIPR (Protein–Protein Interaction Prediction Based on Siamese Residual RCNN), for PPI predictions using only the protein sequences. PIPR incorporates a deep residual recurrent convolutional neural network in the Siamese architecture, which leverages both robust local features and contextualized information, which are significant for capturing the mutual influence of protein sequences. PIPR relieves the data pre-processing efforts that are required by other systems, and generalizes well to different application scenarios. Experimental evaluations show that PIPR outperforms various state-of-the-art systems on the binary PPI prediction problem. Moreover, it shows a promising performance on more challenging problems of interaction type prediction and binding affinity estimation, where existing approaches fall short. Availability and implementation The implementation is available at https://github.com/muhaochen/seq_ppi.git. Supplementary information Supplementary data are available at Bioinformatics online.

44

Devkota, Kapil, James M. Murphy, and Lenore J. Cowen. "GLIDE: combining local methods and diffusion state embeddings to predict missing interactions in biological networks." Bioinformatics 36, Supplement_1 (2020): i464–i473. http://dx.doi.org/10.1093/bioinformatics/btaa459.

Abstract:

Abstract Motivation One of the core problems in the analysis of biological networks is the link prediction problem. In particular, existing interaction networks are noisy and incomplete snapshots of the true network, with many true links missing because those interactions have not yet been experimentally observed. Methods to predict missing links have been more extensively studied for social than for biological networks; it was recently argued that there is some special structure in protein–protein interaction (PPI) network data that might mean that alternate methods may outperform the best methods for social networks. Based on a generalization of the diffusion state distance, we design a new embedding-based link prediction method called global and local integrated diffusion embedding (GLIDE). GLIDE is designed to effectively capture global network structure, combined with alternative network type-specific customized measures that capture local network structure. We test GLIDE on a collection of three recently curated human biological networks derived from the 2016 DREAM disease module identification challenge as well as a classical version of the yeast PPI network in rigorous cross validation experiments. Results We indeed find that different local network structure is dominant in different types of biological networks. We find that the simple local network measures are dominant in the highly connected network core between hub genes, but that GLIDE’s global embedding measure adds value in the rest of the network. For example, we make GLIDE-based link predictions from genes known to be involved in Crohn’s disease, to genes that are not known to have an association, and make some new predictions, finding support in other network data and the literature. Availability and implementation GLIDE can be downloaded at https://bitbucket.org/kap_devkota/glide. Supplementary information Supplementary data are available at Bioinformatics online.

45

Rogers, Julia R., Sean M. McHugh, and Yu-Shan Lin. "Predictions for α-Helical Glycopeptide Design from Structural Bioinformatics Analysis." Journal of Chemical Information and Modeling 57, no. 10 (2017): 2598–611. http://dx.doi.org/10.1021/acs.jcim.7b00123.

46

Pokkuluri, Kiran Sree, Ramesh Babu Inampudi, and S. S. S. N. Usha Devi Nedunuri. "IN-MACA-MCC: Integrated Multiple Attractor Cellular Automata with Modified Clonal Classifier for Human Protein Coding and Promoter Prediction." Advances in Bioinformatics 2014 (July 15, 2014): 1–7. http://dx.doi.org/10.1155/2014/261362.

Abstract:

Protein coding and promoter region predictions are very important challenges of bioinformatics (Attwood and Teresa, 2000). The identification of these regions plays a crucial role in understanding the genes. Many novel computational and mathematical methods have been introduced, and existing methods are being refined, for predicting each of the regions separately; still, there is scope for improvement. We propose a classifier that is built with MACA (multiple attractor cellular automata) and MCC (modified clonal classifier) to predict both regions with a single classifier. The proposed classifier is trained and tested with Fickett and Tung (1992) datasets for protein coding region prediction for DNA sequences of lengths 54, 108, and 162. This classifier is trained and tested with MMCRI datasets for protein coding region prediction for DNA sequences of lengths 252 and 354. The proposed classifier is trained and tested with promoter sequences from DBTSS (Yamashita et al., 2006) dataset and nonpromoters from EID (Saxonov et al., 2000) and UTRdb (Pesole et al., 2002) datasets. The proposed model can predict both regions with an average accuracy of 90.5% for promoter and 89.6% for protein coding region predictions. The specificity and sensitivity values of promoter and protein coding region predictions are 0.89 and 0.92, respectively.

47

Schlessinger, Avner, Marco Punta, and Burkhard Rost. "Natively unstructured regions in proteins identified from contact predictions." Bioinformatics 23, no. 18 (2007): 2376–84. http://dx.doi.org/10.1093/bioinformatics/btm349.

48

Sugden, Lauren A., Michael R. Tackett, Yiannis A. Savva, William A. Thompson, and Charles E. Lawrence. "Assessing the validity and reproducibility of genome-scale predictions." Bioinformatics 29, no. 22 (2013): 2844–51. http://dx.doi.org/10.1093/bioinformatics/btt508.

49

King, R. D., P. H. Wise, and A. Clare. "Confirmation of data mining based predictions of protein function." Bioinformatics 20, no. 7 (2004): 1110–18. http://dx.doi.org/10.1093/bioinformatics/bth047.

50

Ellis, Lynda B. M., and Robert P. Milius. "Valid and invalid implementations of GOR secondary structure predictions." Bioinformatics 10, no. 3 (1994): 341–48. http://dx.doi.org/10.1093/bioinformatics/10.3.341.
