Studies of diverse phylogenetic lineages reveal that protein disorder increases in concert with organismic complexity, but that differences nevertheless exist among lineages. To gain insight into this phenomenology, we analyzed all of the transcription factor (TF) families for which sequences are known in 17 species spanning bacteria, yeast, algae, land plants, and animals, and for which the number of different cell types has been reported in the primary literature. Although the fraction of disordered residues in TF sequences is often only moderately or poorly correlated with organismic complexity as gauged by cell-type number (r² < 0.5), an unbiased and phylogenetically broad analysis shows that organismic complexity is positively and strongly correlated with the total number of TFs, the number of their spliced variants, and their total disordered-residue content (r² > 0.8). Furthermore, the correlation between the fraction of disordered residues and cell-type number becomes stronger when the analysis is confined to the TF families participating in cell cycle, cell size, cell division, cell differentiation, cell proliferation, and other important developmental processes. The data also indicate that evolutionarily simpler organisms allow for the detection of subtle differences in the conserved intrinsically disordered regions (IDRs) of TFs, as well as changes in variable IDRs, which can influence the DNA recognition and multifunctionality of TFs through direct or indirect mechanisms. Although strong correlations cannot be taken as evidence for cause-and-effect relationships, we interpret our data to indicate that increasing TF disorder was likely an important factor contributing to the evolution of organismic complexity and not merely a concurrent, unrelated effect of increasing organismic complexity.
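The abstract distinguishes two disorder measures (the fraction of disordered residues and the total disordered-residue content) and reports their correlation with cell-type number as r². The sketch below shows one way such quantities could be computed. It is not the authors' pipeline: the function names, the per-residue boolean representation of disorder, and the use of a squared Pearson coefficient for r² are all assumptions made for illustration, and no values from the study are reproduced.

```python
def fraction_disordered(flags):
    """Fraction of residues flagged as disordered in one sequence.

    `flags` is a hypothetical per-residue list of booleans
    (True = residue predicted disordered); this representation is an assumption.
    """
    if not flags:
        raise ValueError("sequence must contain at least one residue")
    return sum(flags) / len(flags)


def total_disordered(sequences):
    """Total number of disordered residues summed over all sequences of a species."""
    return sum(sum(flags) for flags in sequences)


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two equal-length sequences.

    Assumption: the abstract's r2 is treated here as the square of Pearson's r.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return (sxy * sxy) / (sxx * syy)
```

In use, one value per species would be computed for each measure (for example, the mean `fraction_disordered` over its TF sequences, or `total_disordered` over all of them) and passed to `r_squared` together with the reported cell-type numbers.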
Keywords
- Cell-type number
- Intrinsically disordered protein (IDP)
- Transcription factors
ASJC Scopus subject areas
- Ecology, Evolution, Behavior and Systematics