Differential Retention of Pfam Domains Contributes to Long-term Evolutionary Trends.

James JE, Nelson PG, Masel J

Mol. Biol. Evol. 40 (4) - [2023-04-04; online 2023-03-23]

Protein domains that emerged more recently in evolution have a higher structural disorder and greater clustering of hydrophobic residues along the primary sequence. It is hard to explain how selection acting via descent with modification could act so slowly as not to saturate over the extraordinarily long timescales over which these trends persist. Here, we hypothesize that the trends were created by a higher level of selection that differentially affects the retention probabilities of protein domains with different properties. This hypothesis predicts that loss rates should depend on disorder and clustering trait values. To test this, we inferred loss rates via maximum likelihood for animal Pfam domains, after first performing a set of stringent quality control methods to reduce annotation errors. Intermediate trait values, matching those of ancient domains, are associated with the lowest loss rates, making our results difficult to explain with reference to previously described homology detection biases. Simulations confirm that effect sizes are of the right magnitude to produce the observed long-term trends. Our results support the hypothesis that differential domain loss slowly weeds out those protein domains that have nonoptimal levels of disorder and clustering. The same preferences also shape the differential diversification of Pfam domains, thereby further impacting proteome composition.

DDLS Fellow

Jennifer James

PubMed 36947137

DOI 10.1093/molbev/msad073

Crossref 10.1093/molbev/msad073

pmc: PMC10089649
pii: 7083726

