Scope: We analyse domain superfamily expansions across a variety of eukaryotic genomes.
Summary:
During the course of evolution, new proteins are produced very largely as the result of gene duplication, divergence and, in many cases, combination. This means that proteins, or protein domains, belong to families or, in cases where their relationships can only be recognised on the basis of structure, to superfamilies whose members are descended from a common ancestor. The size of superfamilies can vary greatly. Also, during the course of evolution, organisms of increasing complexity have arisen. In this paper we identify those superfamilies whose relative sizes in different organisms are highly correlated with the complexity of the organisms.
As a measure of the complexity of 38 uni- and multicellular eukaryotes, we took the number of different cell types of which they are composed. Of 1219 different superfamilies, there are 194 whose sizes in the 38 organisms have a high correlation with the number of cell types in the organisms. We give outline descriptions of these superfamilies. Half are involved in extracellular processes or regulation, and smaller proportions in other types of activity. Half of all superfamilies have no significant correlation with complexity. We also determined whether the expansions of large superfamilies correlate with each other. We found three large clusters of correlated expansions: one involves vertebrate and plant superfamilies, one those in vertebrates and one those in plants.
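The core comparison described above, superfamily size against number of cell types across organisms, can be illustrated with a minimal Python sketch. It is not the method used in the paper: the summary does not state which correlation statistic, size normalisation or cut-off was applied, so the Pearson coefficient, the `threshold` parameter, the function names and the toy numbers below are all assumptions chosen for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumption: Pearson is used here only as an example statistic;
    the summary does not say which measure the authors applied.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant series carries no information about co-variation.
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlated_superfamilies(cell_types, sizes_by_superfamily, threshold):
    """Return the names of superfamilies whose sizes correlate with
    the number of cell types at or above `threshold` (hypothetical cut-off).
    """
    return sorted(
        name
        for name, sizes in sizes_by_superfamily.items()
        if pearson(sizes, cell_types) >= threshold
    )


# Toy input: five hypothetical organisms, not data from the publication.
cell_types = [1, 3, 10, 50, 170]
sizes_by_superfamily = {
    "sf_A": [2, 5, 20, 90, 300],     # grows with complexity
    "sf_B": [40, 38, 41, 39, 40],    # roughly constant
    "sf_C": [100, 80, 50, 20, 5],    # shrinks with complexity
}

selected = correlated_superfamilies(cell_types, sizes_by_superfamily, 0.8)
print(selected)
```

In this toy input only the superfamily whose size rises with the number of cell types passes the cut-off. The same `pearson` function could be applied to pairs of superfamilies to explore whether their expansions correlate with each other, again only as an illustration.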
Files from the publication: