Dissertations / Theses on the topic 'Subcellular localization prediction'
Consult the top 20 dissertations / theses for your research on the topic 'Subcellular localization prediction.'
Ozarar, Mert. "Prediction Of Protein Subcellular Localization Based On Primary Sequence Data." Master's thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/1082320/index.pdf.
Bozkurt, Burcin. "Prediction Of Protein Subcellular Localization Using Global Protein Sequence Feature." Master's thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/3/1135292/index.pdf.
[...] and self-organizing maps (SOM) are used. For classification purposes, support vector machines (SVM), a method of statistical learning theory recently introduced to bioinformatics, are used. The aim of combining feature extraction, clustering and classification methods is to design an accurate system that predicts the subcellular localization of the proteins presented to it. Our scheme combines the methods in a cascading, or serial, architecture, in which the output of one method serves as the input of the next.
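The cascading architecture described in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the SOM and SVM stages are replaced by simple stand-ins (a nearest-prototype assignment and a lookup table), and the prototype sequences and labels are made up for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Stage 1 (feature extraction): fraction of each of the 20 amino acids."""
    counts = Counter(seq)
    total = max(len(seq), 1)
    return [counts[a] / total for a in AMINO_ACIDS]

def nearest_prototype(prototypes):
    """Stage 2 (stand-in for clustering): index of the closest prototype vector."""
    def assign(vec):
        dists = [sum((v - p) ** 2 for v, p in zip(vec, proto)) for proto in prototypes]
        return dists.index(min(dists))
    return assign

def lookup_classifier(labels):
    """Stage 3 (stand-in for classification): map a cluster index to a label."""
    def classify(cluster):
        return labels[cluster]
    return classify

def cascade(stages):
    """Serial combination: the output of each stage is the input of the next."""
    def run(x):
        for stage in stages:
            x = stage(x)
        return x
    return run

# Hypothetical prototypes and labels, for illustration only.
prototypes = [composition("KKKKRRRR"), composition("LLLLVVVV")]
labels = ["nuclear", "membrane"]
predictor = cascade([composition, nearest_prototype(prototypes), lookup_classifier(labels)])
print(predictor("KRKRKKLL"))
```

Each stage only needs to accept what the previous one produces, so a stage can be swapped (for example, a trained SOM for the prototype step) without touching the others.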
Scott, Michelle. "Protein subcellular localization : analysis and prediction using the endoplasmic reticulum as a model organelle." Thesis, McGill University, 2005. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=102170.
To facilitate the investigation of the endoplasmic reticulum (ER) and, more generally, the secretory pathway, we have created Hera, a publicly accessible protein localization database. Originally designed to house characteristics of ER proteins, it currently contains tens of thousands of proteins from different organisms and subcellular compartments. Hera was originally used to investigate features of ER proteins, providing insight into the extent of usage of various localization mechanisms, including well-studied as well as non-classical and novel mechanisms.
Hera was subsequently used to create Bayesian network type localization predictors. By considering the combinatorial presence of motifs, domains and targeting signals and, in some cases, protein interaction information, our predictors achieve high accuracy and coverage. When our predictions are compared with localization annotations from high-throughput studies in both human and yeast, we find that disagreements mainly involve proteins in the secretory pathway. Our predictors can be used to independently validate these large-scale studies. We further refined the localization prediction of the whole yeast proteome by distinguishing proteins localized to the lumen or membrane of various organelles from cytosolic proteins peripherally associated with these organelles.
Hera was also used to investigate efficient and informative approaches to interrogate interaction networks in order to gain insight into the relationship between proteins/genes of interest. By combining interaction and refined localization information, we constructed localizome-interactome networks of whole organelles. Such models provide insight into global organellar characteristics and inter-organellar mechanisms of communication.
The research presented in this thesis demonstrates that integrating widely available information, such as localization and interaction data, in an appropriate framework such as Bayesian networks yields deep insights into cellular processes.
Zhu, Lu [Verfasser]. "Context-specific subcellular localization prediction: Leveraging protein interaction networks and scientific texts / Lu Zhu." Bielefeld : Universitätsbibliothek Bielefeld, 2018. http://d-nb.info/1169314589/34.
Fagerberg, Linn. "Mapping the human proteome using bioinformatic methods." Doctoral thesis, KTH, Proteomik, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-31477.
The Human Protein Atlas project
Yu, Chin-Sheng, and 游景盛. "Prediction of Protein Subcellular Localization." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/28216444510128135886.
National Chiao Tung University
Department of Biological Science and Technology
95
Since a protein's function is usually related to its subcellular localization, the ability to predict subcellular localization directly from protein sequences is useful to biologists for inferring protein function. Recent years have seen a surging interest in the development of novel computational tools to predict subcellular localization. With the rapid increase of sequenced genomic data, the need for an automated and accurate tool to predict subcellular localization becomes increasingly important. At present, these approaches, based on a wide range of algorithms, have achieved varying degrees of success for specific organisms and for certain localization categories. In this thesis, I used a support vector machine (SVM) method based on n-peptide composition to predict the subcellular locations of proteins. For an unbiased assessment of the results, we first apply our approach to several independent data sets. On those data sets, our approach gives superior performance compared with other approaches. A number of authors have noticed that sequence similarity is useful in predicting subcellular localization. For example, Rost and Nair (Protein Sci, 11:2836-47 (2002)) have carried out extensive analysis of the relation between sequence similarity and identity in subcellular localization and found a close relationship between them above a certain similarity threshold. However, many existing benchmark data sets used for assessing prediction accuracy contain highly homologous sequences, with some data sets comprising sequences of up to 80-90% sequence identity. Using these benchmark test data will surely lead to overestimation of the performance of the methods considered.
Here, we developed an approach based on a two-level SVM system: the first level comprises a number of SVM classifiers, each based on a specific type of feature vector derived from sequences; the second-level SVM classifier functions as the jury machine that generates the probability distribution of decisions for possible localizations. We compare our approach with a global sequence alignment approach and other existing approaches on two often-used benchmark data sets, one comprising prokaryotic sequences and the other eukaryotic sequences. Furthermore, we carried out all-against-all sequence alignment for several data sets to check the relationship between sequence homology and localization. Our results, which are consistent with previous studies, indicate that the homology search approach performs surprisingly well for sequences sharing homology as low as 30%, but its performance deteriorates considerably for sequences sharing lower sequence identity. A data set with high homology levels will obviously lead to biased assessment of the performance of predictive approaches, especially those relying on homology search or sequence annotations. Since our two-level SVM classification system does not rely on homology search, its performance remains relatively unaffected by sequence homology. Our approach outperformed other existing approaches, even though some of them use homology search as part of their algorithms. Furthermore, for practical purposes, we also developed a hybrid method that pipelines the two-level SVM classifier and the homology search method in sequential order as a general tool for annotating the subcellular localization of sequences. Our approaches should be valuable in the high-throughput analysis of genomics and proteomics.
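As a rough illustration of two ideas in this abstract, the sketch below computes an n-peptide composition vector and shows one possible way to chain a homology search with a classifier. The function names, the order of the two steps, and the use of 30% identity as a cut-off are assumptions made for the example; the abstract states only that the two methods are pipelined sequentially.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def n_peptide_composition(seq, n):
    """Normalized frequency of every length-n peptide, in a fixed alphabetical order."""
    windows = [seq[i:i + n] for i in range(len(seq) - n + 1)]
    counts = Counter(windows)
    total = max(len(windows), 1)
    return [counts["".join(p)] / total for p in product(AMINO_ACIDS, repeat=n)]

def hybrid_predict(best_hit_identity, best_hit_label, classifier_label, threshold=0.30):
    """Hypothetical hybrid rule: trust the annotation of the best homolog when its
    sequence identity reaches the threshold, otherwise fall back to the classifier."""
    if best_hit_label is not None and best_hit_identity >= threshold:
        return best_hit_label
    return classifier_label

vec = n_peptide_composition("ACDA", 2)  # dipeptides: AC, CD, DA
print(len(vec))                          # 20**2 entries
print(hybrid_predict(0.45, "nucleus", "cytoplasm"))
print(hybrid_predict(0.20, "nucleus", "cytoplasm"))
```

The vector length grows as 20^n, which is why small n (1 to 3) is the usual practical range for this kind of feature.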
Syu, Shiao-shan, and 徐筱姍. "Human Protein Subcellular Localization Prediction." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/96176482574886082780.
Feng Chia University
Master's Program of Biomedical Informatics and Biomedical Engineering
99
The biological function of a protein in a cell is often closely correlated with its subcellular localization. Hence, information about where a protein is localized often offers important clues to the function of an uncharacterized sequence. Protein subcellular localization can be used as an important feature in screening for drug candidates, vaccine design, and gene product annotation. Here, we applied the support vector machine (SVM) algorithm to a benchmark data set of human protein sequences based on n-peptide composition. In the first step of this method, we encode the protein sequences with different features and use SVMs to predict subcellular localization. In the second step, we use the results of the first step as input to a further SVM classifier that predicts again. We use the PSLT training data set from Hera, which includes 2233 human protein sequences and 9 subcellular localizations inside the cell. Our method achieves an overall classification accuracy of 80%, as estimated by a 10-fold cross-validation test, with a coverage of 74%. For the remaining 26%, our method achieves an overall classification accuracy of 45%. This research should provide an important tool for human genomics and proteomics studies.
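The abstract reports two accuracies on two disjoint parts of the data. If both figures are per-protein accuracies (an assumption; the abstract does not give a combined number), the accuracy over all proteins is their coverage-weighted average, as this short calculation shows.

```python
# Figures quoted in the abstract.
covered_fraction = 0.74     # share of proteins covered by the confident predictions
covered_accuracy = 0.80     # accuracy on that share
remaining_fraction = 0.26   # the rest of the proteins
remaining_accuracy = 0.45   # accuracy on the rest

# Coverage-weighted average over the whole data set (derived here, not reported).
combined = covered_fraction * covered_accuracy + remaining_fraction * remaining_accuracy
print(f"{combined:.3f}")  # 0.709
```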
Chen, Shu-Pin, and 陳書品. "Prediction of eukaryotic protein subcellular localization." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/10932435428409959975.
National Central University
Graduate Institute of Computer Science and Information Engineering
96
Prediction of the subcellular localization of proteins is an important and well-studied problem. Each compartment in a cell has specific tasks, and the proteins in each compartment are synthesized to fulfill these tasks. Proteins localized in the same compartment are thought to have the same or similar functions. Knowledge of the subcellular localization of a protein can significantly improve target identification during the drug discovery process. Currently available methods extract information from the amino acid sequence or signal peptide and lack further biological features such as post-translational modification. We develop an integrated system that lets biologists determine where eukaryotic proteins are localized. The system is based on protein sequence composition, signal peptides, protein domains from Pfam and homology search.
Chen, Shih-Hao, and 陳世豪. "Subcellular Localization Prediction of Eukaryotic Protein." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/80756466635576715069.
Taichung Healthcare and Management University
Graduate Institute of Bioinformatics
92
Biologically, the function of a protein is highly related to its subcellular localization. Accordingly, it is necessary to develop an automatic yet reliable method for protein subcellular localization prediction, especially when large-scale genome sequences are to be analyzed. Various methods have been proposed to perform the task. The results, however, are not satisfactory in terms of effectiveness and efficiency. In this paper, the proposed Bayesian inference method and information gain are used to identify important information. Moreover, nearest-neighbor classification is considerably effective for subcellular localization prediction in a supervised fashion.
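Information gain, mentioned in this abstract as a way to find informative features, is the reduction in label entropy obtained by splitting on a feature. The sketch below uses the standard definition; how the thesis discretizes features or combines the score with Bayesian inference is not stated, so the toy feature and labels are purely illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """H(labels) minus the weighted entropy of labels within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Toy example: a binary feature (e.g. "has a signal peptide") against two locations.
locations = ["ER", "ER", "cytoplasm", "cytoplasm"]
print(information_gain([1, 1, 0, 0], locations))  # feature separates the classes
print(information_gain([1, 0, 1, 0], locations))  # feature carries no information
```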
Chen, Yu-Tzu, and 陳佑慈. "Protein-protein interaction prediction enhancement using subcellular localization." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/81806002826018277394.
National Central University
Graduate Institute of Computer Science and Information Engineering
98
Protein–protein interactions are important for almost every process in a living cell. Abnormal interactions may have implications in a number of neurological syndromes. Therefore, it is crucial to recognize the association and dissociation of protein molecules. Currently available computational methods for predicting protein–protein interactions extract information from the amino acid sequence or signal peptide; few methods consider subcellular localization information. The method presented in this paper is based on the assumption that two proteins must be present in the same subcellular localization in order to interact. We develop an integrated system, based on a support vector machine learning algorithm, to predict protein–protein interactions. We construct a training model for each subcellular localization, and each test protein pair is assigned to one model according to its localization. The method takes protein sequence composition, protein domains and subcellular localization information as features. Its prediction ability is better than that of other sequence-based protein–protein interaction prediction methods. In addition, more complete data on protein–protein interactions and subcellular localizations can further enhance the prediction ability of the method.
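The routing step described in this abstract, one model per localization with each pair sent to the model for its compartment, can be sketched as follows. The per-compartment models here are hypothetical threshold functions standing in for trained SVMs, and treating pairs from different compartments as non-interacting is one reading of the stated assumption.

```python
def select_model(models, loc_a, loc_b):
    """Return the model for the shared compartment, or None if the two
    proteins are annotated to different compartments or no model exists."""
    if loc_a != loc_b:
        return None
    return models.get(loc_a)

def predict_interaction(models, pair_features, loc_a, loc_b):
    """Predict with the compartment-specific model; pairs without a shared
    compartment are treated as non-interacting under the co-localization assumption."""
    model = select_model(models, loc_a, loc_b)
    if model is None:
        return False
    return model(pair_features)

# Hypothetical stand-ins for trained classifiers, one per compartment.
models = {
    "nucleus": lambda features: sum(features) > 1.0,
    "cytoplasm": lambda features: sum(features) > 0.5,
}

print(predict_interaction(models, [0.4, 0.3], "cytoplasm", "cytoplasm"))
print(predict_interaction(models, [0.4, 0.3], "nucleus", "nucleus"))
print(predict_interaction(models, [0.9, 0.9], "nucleus", "cytoplasm"))
```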
Su, Chia-Yu, and 蘇家玉. "Prediction of Subcellular Localization and RNA-binding Sites in Proteins." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/43820761805319359116.
National Chiao Tung University
Institute of Bioinformatics
97
Automated function annotation is a major goal of the post-genomic era, given the tremendous number of protein sequences in the databases. Prediction of subcellular localization or binding sites in proteins is crucial for function analysis, genome annotation, and drug discovery. Determining localization or structure with experimental approaches is time-consuming; thus, computational approaches become highly desirable. We propose two protein subcellular localization prediction methods, PSL101 and PSLDoc. PSL101 combines a structural homology approach and a support vector machine model, in which compartment-specific biological features derived from bacterial translocation pathways are incorporated. PSLDoc uses probabilistic latent semantic analysis on gapped dipeptides of various distances, where evolutionary information from the position-specific scoring matrix (PSSM) is utilized. Our methods achieve 93% overall accuracy for Gram-negative bacteria and compare favorably with the state-of-the-art results, by 7.4%, on a benchmark dataset having low homology to the training set. Experimental results demonstrate that both biological features derived from translocation pathways and feature reduction by document classification techniques can lead to a significant improvement in prediction performance. Moreover, the proposed biological features and gapped-dipeptide signatures are interpretable and can be applied in advanced studies and experiment designs. For RNA-binding site prediction, we propose another method, RNAProB, which incorporates a new smoothed PSSM encoding scheme in a support vector machine model. The proposed smoothed PSSM encoding considers correlation and dependency from neighboring residues for each amino acid in a protein sequence. Experimental results show that smoothed PSSM encoding significantly enhances prediction performance, especially sensitivity.
Our method performs better than the state-of-the-art systems by 4.90%~6.83%, 7.05%~26.90%, 0.88%~5.33%, and 0.10~0.23 in terms of overall accuracy, sensitivity, specificity, and Matthews correlation coefficient, respectively. This also supports our assumption that smoothed PSSM encoding can better resolve the ambiguity in discriminating between interacting and non-interacting residues by modeling the dependency on surrounding residues. Because of the generality of the proposed methods, they can be extended to other research topics in the future. Moreover, the information from predicted localization and structure of proteins can be used collectively to assist biologists in both inferring protein function and finding suitable drug targets. Therefore, we believe that our work can contribute to scientific discoveries on a high-throughput basis.
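One simple way to let each residue's PSSM row reflect its neighbors is to replace it with the sum of the rows in a window centered on that residue. The sketch below does exactly that; the window size, the handling of sequence ends (the window is simply truncated) and the function name are assumptions, and the thesis's actual encoding may differ in these details.

```python
def smooth_pssm(pssm, window=3):
    """Replace each row with the element-wise sum of the rows inside a centered
    window of odd size; the window is truncated at the sequence termini."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n_rows = len(pssm)
    n_cols = len(pssm[0]) if n_rows else 0
    smoothed = []
    for i in range(n_rows):
        row = [0] * n_cols
        for j in range(max(0, i - half), min(n_rows, i + half + 1)):
            for k in range(n_cols):
                row[k] += pssm[j][k]
        smoothed.append(row)
    return smoothed

# Tiny made-up matrix: 3 residues x 2 columns (a real PSSM has 20 columns).
toy_pssm = [[1, 0], [2, 1], [3, -1]]
print(smooth_pssm(toy_pssm, window=3))  # [[3, 1], [6, 0], [5, 0]]
```

With `window=1` the matrix is returned unchanged, which gives a convenient baseline for checking whether smoothing helps.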
Gaston, Daniel. "PHYLOGENOMIC APPROACHES TO THE ANALYSIS OF FUNCTIONAL DIVERGENCE AND SUBCELLULAR LOCALIZATION." 2012. http://hdl.handle.net/10222/14439.
Yen, Shou-Cheng, and 嚴守正. "Towards Improving Accuracy of Subcellular Localization Prediction for Lysosomal, Peroxisomal and ER Proteins." Thesis, 2011. http://ndltd.ncl.edu.tw/handle/22770069206338505923.
National Yang-Ming University
Institute of Biomedical Informatics
99
Within a cell, biological functions are often localized in specific subcellular compartments. Hence, the ability to predict the subcellular localization of uncharacterized proteins is critical for protein functional annotation. This study describes a novel method for identifying sequence motifs to predict protein subcellular localization. Existing methods mostly rely on prior knowledge about protein targeting signals and on sophisticated residue compositions that provide obscure insights into cellular functions. Here we propose a systematic approach to identify signature motifs without using prior knowledge. Attention was placed on the localizations that are traditionally more difficult to predict, i.e., lysosomal, peroxisomal and ER proteins. For proteins within those localizations, we investigated all sequence motifs (length < 8) represented in a reduced amino acid alphabet. Each motif was then subjected to a statistical test to determine whether it has a distinct occurrence frequency for proteins in the specific localization. The identified sequence motifs were further extended on both ends to increase their length. Three of the motifs have never been reported in the field of localization prediction: (1) the [WFY][AVLI][AVLI]KNS[WFY] motif, a lysosome-specific motif found at the cathepsin protease active site; (2) the RERIPERVVHA motif, exclusive to peroxisomal proteins; and (3) the enriched CGHC motif, present exclusively in ER proteins. The results facilitate our in-house implementation of a more accurate prediction tool for lysosomal, peroxisomal and ER proteins, the three most challenging localizations. We propose a prediction system that uses existing approaches and corrects their misidentifications using our novel motifs and existing motifs. The results show that our system predicts ER, lysosomal and peroxisomal proteins more accurately, with MCCs of 0.61, 0.63, and 0.53, respectively.
With extension to proteins located in other subcellular compartments and to a wider range of physicochemical properties, our discovery-oriented approach fills the gaps left by current studies in this field.
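The three motifs quoted in this abstract are written in a bracket notation that maps directly onto regular expressions, so scanning a sequence for them takes only a few lines. The sketch below also includes the standard Matthews correlation coefficient (MCC) used to report the results. The test sequences are invented, and a motif hit alone is not the thesis's full prediction system, which combines motifs with existing predictors.

```python
import math
import re

# Motifs as quoted in the abstract, compiled as regular expressions.
MOTIFS = {
    "lysosome": re.compile(r"[WFY][AVLI][AVLI]KNS[WFY]"),
    "peroxisome": re.compile(r"RERIPERVVHA"),
    "ER": re.compile(r"CGHC"),
}

def motif_hits(seq):
    """Return the localizations whose signature motif occurs anywhere in seq."""
    return [loc for loc, pattern in MOTIFS.items() if pattern.search(seq)]

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts;
    returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

print(motif_hits("MKWAVKNSWLL"))  # invented sequence containing WAVKNSW
print(motif_hits("AACGHCAA"))     # invented sequence containing CGHC
print(mcc(50, 50, 0, 0))          # perfect prediction
```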
Huang, Wen-Lin, and 黃文玲. "Using Gene Ontology Annotation and Physicochemical Properties for Prediction of Protein Subcellular and Subnuclear Localization." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/09303799954362845008.
Feng Chia University
Graduate Institute of Information Engineering
96
Eukaryotic cells comprise several major compartments, such as the nucleus, cytoplasm, mitochondrion, extracellular space, and chloroplast. One of the fundamental goals in molecular cell biology and proteomics is to identify the subcellular locations or environments of proteins, because the function of a protein and its role in a cell are closely correlated with the compartment or organelle in which it resides. The knowledge thus obtained can help us make timely use of newly found protein sequences for both basic research and drug discovery. Among the subcellular compartments, the nucleus is a highly complex organelle that forms a package for genes and their corresponding regulatory factors. Therefore, prediction of subcellular and subnuclear localization is a critical problem in biology. Computational prediction methods based on primary protein sequences are fairly economical for identifying many proteins with unknown functions. Accurate prediction methods not only rely on informative features and classifier design but also emphasize feature selection. This dissertation proposes two novel genetic-algorithm-based algorithms, GOmining and ESVM, for subcellular and subnuclear localization prediction. Combined with a support vector machine (SVM), the two algorithms can simultaneously determine the best number m and identify a small subset of m out of the n features. Using GOmining and ESVM, this dissertation proposes two prediction systems, ProLoc-GO and ProLoc, which mine informative Gene Ontology (GO) terms and physicochemical composition (PCC) features for protein subcellular and subnuclear localization, respectively. To evaluate ProLoc, this study uses two datasets, SNL6 and SNL9, which have 504 proteins localized in six subnuclear compartments and 367 proteins localized in nine subnuclear compartments, respectively. ProLoc, using the selected mPCC=33 and 28 PCC features, has accuracies of 56.37% for SNL6 and 72.82% for SNL9, respectively.
As for the ProLoc-GO system, it utilizes GOmining to identify a small number m out of the n GO terms as input features to the SVM, where m << n. The m informative GO terms contain the essential GO terms annotating subcellular compartments, such as GO:0005634 (Nucleus), GO:0005737 (Cytoplasm) and GO:0005856 (Cytoskeleton). Two existing data sets, SCL12 (human proteins with 12 locations) and SCL16 (eukaryotic proteins with 16 locations), with <25% sequence identity are used to evaluate ProLoc-GO, which has been implemented using a single SVM classifier with the m=44 and m=60 informative GO terms, respectively. ProLoc-GO using input sequences yields test accuracies of 88.1% and 83.3% for SCL12 and SCL16, respectively. Since GOmining incorporated with GO is efficient, an improved prediction system, NuProLoc, which uses GOmining, is proposed for subnuclear localization prediction. NuProLoc yields accuracies of 75.6% and 82.4% for SNL6 and SNL9, respectively, which are significantly better than the 56.37% and 72.82% of ProLoc. The growth of Gene Ontology and physicochemical properties in size and popularity has increased the effectiveness of GO-based and PCC-based features. GOmining and ESVM can serve as tools for selecting informative GO terms and PCC features in solving sequence-based prediction problems.
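Once a small set of informative GO terms has been selected, each protein can be encoded as a binary vector marking which of those terms annotate it. The sketch below shows only this encoding step, assuming a plain presence/absence representation; the genetic-algorithm selection performed by GOmining is not reproduced, and the three selected terms are simply the examples named in the abstract.

```python
# The three GO terms named in the abstract, used here as a toy "selected" set.
SELECTED_TERMS = ["GO:0005634", "GO:0005737", "GO:0005856"]

def go_feature_vector(annotated_terms, selected_terms=SELECTED_TERMS):
    """Binary vector with one entry per selected GO term: 1 if the protein is
    annotated with that term, 0 otherwise. Unselected terms are ignored."""
    annotated = set(annotated_terms)
    return [1 if term in annotated else 0 for term in selected_terms]

# A protein annotated as nuclear plus one term outside the selected set.
print(go_feature_vector({"GO:0005634", "GO:0005739"}))  # [1, 0, 0]
```

Keeping m much smaller than n in this way means the classifier sees a short, fixed-length input regardless of how many GO terms a protein carries.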
Liao, Jun-Qin, and 廖俊欽. "Protein Subcellular Localization Prediction by Support Vector Machine and Genetic Algorithm based on n-Peptide Compositions." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/4u3ruf.
Li, Wei-Jyun, and 李瑋峻. "Predicting Protein Subcellular Localization Using Integrative System." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/46151991856883970059.
National Taiwan Ocean University
Department of Computer Science and Engineering
97
The prediction of protein subcellular localization (PSL) has become a popular field in recent years because it can help protein function prediction and genome annotation, and thus aid drug design. However, experimental methods for analyzing PSL are often expensive and time-consuming. Therefore, the computational prediction of PSL, using information in databases, has become a vibrant field of study. Nevertheless, extracting suitable features from proteins for accurate PSL prediction remains difficult because of the complex structures of proteins. Consequently, to improve prediction performance, several modern PSL prediction systems apply multi-feature protein descriptors and adopt complex hybrid prediction systems to classify and predict PSL. Even though these systems possess outstanding prediction performance, few of them provide protein characteristics and the basis of classification for further analysis. Therefore, this thesis proposes a PSL prediction system, PSL-PR-CPR (Protein Subcellular Localization PredictoR and Characteristic ProvideR), which aims to provide more protein characteristics for analysis. In the PSL-PR-CPR system, proteins are encoded into feature vectors by a protein descriptor, AAwindow, which uses the Amino Acid Index (AAI) to describe proteins in a simple and easily understood way. To derive a prediction model with high prediction performance, PSL-PR-CPR employs MG-PSO-DS, an evolutionary computation algorithm, to select feature sets suitable for the C4.5 classifier to classify and predict PSL. MG-PSO-DS is also applied to optimize C4.5 prediction performance by tuning the C4.5 parameters. After constructing the prediction model, PSL-PR-CPR displays the C4.5 decision rules and provides protein features that assist protein analysis.
In addition, PSL-PR-CPR shows the characteristics of important features within the amino acid sequence, drawing on the easily understood nature of AAwindow, to provide more information for analysis. To validate prediction performance, two datasets were used to compare PSL-PR-CPR with Mycobacterial PSL predictor, Gpos-PLoc, CELLO and LocateP at the end of this thesis. The two datasets are 852 mycobacterial proteins from the study of Mycobacterial PSL predictor and 452 Gram-positive bacterial proteins from the study of Gpos-PLoc. Five-fold and ten-fold cross-validation are used to validate PSL-PR-CPR performance on the 852 mycobacterial proteins and the 452 Gram-positive bacterial proteins, respectively. PSL-PR-CPR also provides samples of C4.5 decision rules, important features and characteristics within the amino acid sequence.
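The k-fold cross-validation used for the two datasets can be sketched as a partition of sample indices into k nearly equal folds, each serving once as the test set. This version splits indices in order without shuffling or stratifying by class, which is a simplification; the thesis does not say how its folds were drawn.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at
    most one, and return a list of (train_indices, test_indices) pairs."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds

# Dataset sizes quoted in the abstract.
myco_folds = k_fold_indices(852, 5)
gpos_folds = k_fold_indices(452, 10)
print([len(test) for _, test in myco_folds])  # [171, 171, 170, 170, 170]
print(len(gpos_folds))                        # 10
```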
Nathan, Michel. "A multiple site predictor for subcellular localization of fungal proteins." Thesis, 2006. http://spectrum.library.concordia.ca/9050/1/MR20780.pdf.
LIN, TSAI-YU, and 林采妤. "Improvement of Predicting Human Protein Subcellular Localization Through Integrated Machine Learning Methods." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/5949gw.
Feng Chia University
Department of Information Engineering and Computer Science
106
The prediction of protein subcellular locations has been an important topic in computational biology research over the past decade. Knowing a protein's subcellular localization helps in understanding its function as well as protein-protein interactions. However, relying on experimental methods to identify the subcellular locations of proteins is often laborious and expensive, so for large-scale protein datasets with unknown locations it is highly desirable to use more efficient computational prediction tools. So far, many methods have been proposed to predict locations for large-scale protein datasets, and statistical machine learning methods have been widely used in model construction. The key step in these predictions is to encode the amino acid sequence as a feature vector. In this paper, we use protein sequences to calculate various n-peptide amino acid compositions and then model these composition features using a machine learning approach, a support vector machine (SVM) [1] combined with a genetic algorithm (GA), where the genetic algorithm is used to select the features. Finally, the prediction results are evaluated by recall, precision and F1 and compared with previous methods. The results show that our method achieves an overall F1 value of 64%. Using a simpler method, we obtain results comparable to or better than those of other, more complex methods.
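The three evaluation measures named in this abstract follow directly from counts of true positives, false positives and false negatives for one class. The sketch below computes them for a single location treated as the positive class; how the thesis averages across locations to reach its overall F1 is not stated, and the toy labels are invented.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, recall and F1 for one class; each is 0.0 when undefined."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented example: 2 true positives, 1 false positive, 1 false negative.
truth = ["nucleus", "nucleus", "cytoplasm", "nucleus"]
preds = ["nucleus", "cytoplasm", "nucleus", "nucleus"]
p, r, f = precision_recall_f1(truth, preds, positive="nucleus")
print(round(p, 3), round(r, 3), round(f, 3))  # 0.667 0.667 0.667
```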
Sun, Han-Hao, and 孫翰豪. "REALoc: Reliable and effective methods to assist predicting human protein subcellular localization." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/76013331557236563304.
National Chung Hsing University
Institute of Genomics and Bioinformatics
101
Protein subcellular localization is an important part of biological research that can support drug development and help explore the function of proteins. Many subcellular localization prediction tools have been developed, most of them trained on eukaryotic or prokaryotic data; predictors specific to human proteins, however, are rare. We established a system, called REALoc, to predict the subcellular localization of human proteins with single (singleplex) and multiple (multiplex) locations. It is based on a two-layer architecture that integrates two different machine learning methods, one-to-one and many-to-many. In addition, the system includes many sequence-based and function-based features, such as amino acid composition and surface accessibility. We also developed a series of computed features, such as weighted sign AAindex, sequence similarity profile and regular-mRMR feature selection for Gene Ontology. Five-fold cross-validation was performed with iLoc-Hum on a training dataset covering 6 location sites (cell membrane, cytoplasm, endoplasmic reticulum/Golgi apparatus, mitochondrion, nucleus, secreted). The overall absolute true success rate of REALoc is 75.34% on the training dataset and 57.14% on the testing dataset, about 10% higher than the other four prediction systems. Finally, this study discusses the performance of the two decision mechanisms, vote and GANN, for predicting single and multiple locations. Furthermore, the relationship between protein-protein interaction and subcellular localization was investigated using motifs.
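For proteins that may reside in several locations, the "absolute true" success rate is commonly understood as the fraction of proteins whose predicted set of locations matches the true set exactly, with no missing and no extra location. The sketch below follows that reading, which is an assumption about the metric used in the abstract; the example proteins are invented.

```python
def absolute_true_rate(true_sets, predicted_sets):
    """Fraction of proteins whose predicted location set equals the true set."""
    if len(true_sets) != len(predicted_sets):
        raise ValueError("both lists must have the same length")
    if not true_sets:
        return 0.0
    exact = sum(set(t) == set(p) for t, p in zip(true_sets, predicted_sets))
    return exact / len(true_sets)

# Invented example with one multi-location protein.
true_locs = [{"Nucleus"}, {"Cytoplasm", "Nucleus"}, {"Secreted"}, {"Mitochondrion"}]
pred_locs = [{"Nucleus"}, {"Cytoplasm"}, {"Secreted"}, {"Nucleus"}]
print(absolute_true_rate(true_locs, pred_locs))  # 0.5
```

The partially correct prediction for the second protein counts as a miss, which is why this measure is stricter than per-label accuracy.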
Shen, Yaoqing. "In silico analysis of mitochondrial proteins." Thesis, 2009. http://hdl.handle.net/1866/3766.
The important role of mitochondria in the eukaryotic cell has long been appreciated, but their exact composition and the biological processes taking place in mitochondria are not yet fully understood. The two main factors that slow down the progress in this field are inefficient recognition and imprecise annotation of mitochondrial proteins. Therefore, we developed a new computational tool, YimLoc, which effectively predicts mitochondrial proteins from genomic sequences. This tool integrates the strengths of existing predictors and yields higher performance than any individual predictor. We applied YimLoc to ~60 fungal genomes in order to address the controversy about the localization of beta oxidation in these organisms. Our results show that in contrast to previous studies, most fungal groups do possess mitochondrial beta oxidation. This work also revealed the diversity of beta oxidation in fungi, which correlates with their utilization of fatty acids as energy and carbon sources. Further, we conducted an investigation of the key component of the mitochondrial beta oxidation pathway, the acyl-CoA dehydrogenase (ACAD). We combined subcellular localization prediction with subfamily classification and phylogenetic inference of ACAD enzymes from 250 species covering all three domains of life. Our study suggests that ACAD genes are an ancient family with innovative evolutionary strategies to generate a large enzyme toolset for utilizing most diverse fatty acids and amino acids. Finally, to enable the prediction of mitochondrial proteins from data beyond genome sequences, we designed the tool TESTLoc that uses expressed sequence tags (ESTs) as input. TESTLoc performs significantly better than known tools.
In addition to providing two new tools for subcellular localization designed for different data, our studies demonstrate the power of combining subcellular localization prediction with other in silico analyses to gain insights into the function of mitochondrial proteins. Most importantly, this work proposes clear hypotheses that are easily testable, with great potential for advancing our knowledge of mitochondrial metabolism.