Tsirigos KD, Hennerdal A, Käll L, Elofsson A
Proteomics 12 (14) 2282-2294 [2012-08-00; online 2012-06-12]
For current state-of-the-art methods, the prediction of correct topology of membrane proteins has been reported to be above 80%. However, this performance has only been observed in small and possibly biased data sets obtained from protein structures or biochemical assays. Here, we test a number of topology predictors on an "unseen" set of proteins of known structure and also on four "genome-scale" data sets, including one recent large set of experimentally validated human membrane proteins with glycosylated sites. The set of glycosylated proteins is also used to examine the ability of prediction methods to separate membrane from nonmembrane proteins. The results show that methods utilizing multiple sequence alignments are overall superior to methods that do not. The best performance is obtained by TOPCONS, a consensus method that combines several of the other prediction methods. The best methods to distinguish membrane from nonmembrane proteins belong to the "Phobius" group of predictors. We further observe that the reported high accuracies in the smaller benchmark sets are not quite maintained in larger scale benchmarks. Instead, we estimate the performance of the best prediction methods for eukaryotic membrane proteins to be between 60% and 70%. The low agreement between predictions from different methods questions earlier estimates about the global properties of the membrane proteome. Finally, we suggest a pipeline to estimate these properties using a combination of the best predictors that could be applied in large-scale proteomics studies of membrane proteins.