PON-Sol: prediction of effects of amino acid substitutions on protein solubility.

Yang Y, Niroula A, Shen B, Vihinen M

Bioinformatics 32 (13) 2032-2034 [2016-07-01; online 2016-02-19]

Solubility is one of the fundamental protein properties. It is of great interest because of its relevance to protein expression. Reduced solubility and protein aggregation are also associated with many diseases. We collected from the literature the largest dataset of experimentally verified solubility-affecting amino acid substitutions (AASs) and used it to train a predictor called PON-Sol. The predictor can distinguish both solubility-decreasing and solubility-increasing variants from those not affecting solubility. PON-Sol has a normalized correct prediction ratio of 0.491 in cross-validation and 0.432 on an independent test set. The performance of the method was compared to both solubility and aggregation predictors and found to be superior. PON-Sol can be used to predict the effects of disease-related substitutions, effects on heterologous recombinant protein expression, and enhanced crystallizability. One application is to investigate the effects of all possible AASs in a protein to aid protein engineering. PON-Sol is freely available at http://structure.bmc.lu.se/PON-Sol. The training and test data are available at http://structure.bmc.lu.se/VariBench/ponsol.php. Contact: mauno.vihinen@med.lu.se. Supplementary data are available at Bioinformatics online.
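As an illustration of the "all possible AASs" application mentioned in the abstract, the following minimal sketch enumerates every single amino acid substitution for a sequence. It is not part of PON-Sol: the function name `all_single_substitutions`, the example sequence, and the `M1A`-style variant notation are assumptions made for illustration, and the input format PON-Sol actually expects is not stated in this record.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_single_substitutions(sequence):
    """Yield every single amino acid substitution of `sequence`.

    Variants are written as <original><1-based position><replacement>,
    e.g. 'M1A'. This notation is an illustrative assumption.
    """
    for position, original in enumerate(sequence, start=1):
        if original not in AMINO_ACIDS:
            raise ValueError(f"Non-standard residue {original!r} at position {position}")
        for replacement in AMINO_ACIDS:
            if replacement != original:
                yield f"{original}{position}{replacement}"


# Hypothetical three-residue example: 3 positions x 19 alternatives each.
variants = list(all_single_substitutions("MKT"))
print(len(variants))  # 57
```

A protein of length N yields 19 × N candidate variants, so the full list for a typical protein stays small enough to score exhaustively with a predictor.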

Abhishek Niroula

DDLS Fellow

PubMed 27153720

DOI 10.1093/bioinformatics/btw066

Crossref 10.1093/bioinformatics/btw066

pii: btw066

