Genes and their interaction networks determine the phenotype of an organism: what it looks like and how it behaves. One of the biggest problems in modern evolutionary biology is understanding the relationship between genes and phenotypes. The prevailing theory is that all animals are built from essentially the same set of regulatory genes (a genetic toolkit), and that phenotypic variation within and between species arises simply by using shared genes differently. Scientists are now generating a vast amount of genomic data from an eclectic mix of organisms. These data are telling us to put to bed the idea that all life is underpinned by a common toolkit of conserved genes. Instead, we need to turn our attention to the role of genomic novelty in the evolution of phenotypic diversity and innovation.
The idea of a conserved genetic toolkit of life comes from the 'evo-devo' (evolutionary and developmental biology) world. In short, it proposes that evolution uses the same ingredients in all organisms, but tinkers with the recipe. By expressing genes at different times in development and/or in different parts of the body, the same genes can be used in different combinations to allow evolvability, generating phenotypic diversity and innovation. Animals look different not because the molecular machinery is different, but because different parts of the machinery are activated to differing degrees, at different times, in different places and in different combinations. The number of combinations is huge, and so this is a plausible explanation for the development of complex and diverse phenotypes from even a small number of genes. For example, we humans have a mere 21,000 genes in our genome, and yet we are arguably one of the most complex products of evolution.
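To get a feel for how quickly those combinations add up, here is a minimal back-of-the-envelope sketch. It is an illustration only: the toolkit size of 20 genes, the reduction of expression to simple on/off (or low/medium/high) states, and the function names are all assumptions made for the example, not measurements or an established model.

```python
from math import comb


def expression_patterns(n_genes: int, levels: int = 2) -> int:
    """Distinct expression patterns if each gene independently
    takes one of `levels` states (2 = simply on or off)."""
    return levels ** n_genes


def gene_subsets(n_genes: int, k: int) -> int:
    """Number of ways to pick k genes to switch on together."""
    return comb(n_genes, k)


def patterns_across_contexts(n_genes: int, n_contexts: int, levels: int = 2) -> int:
    """Patterns when each gene is set independently in each of
    `n_contexts` times or places (e.g. tissues, developmental stages)."""
    return levels ** (n_genes * n_contexts)


TOOLKIT_SIZE = 20  # hypothetical toolkit size, chosen only for illustration

print(expression_patterns(TOOLKIT_SIZE))          # 1048576
print(gene_subsets(TOOLKIT_SIZE, 5))              # 15504
print(patterns_across_contexts(TOOLKIT_SIZE, 3))  # 1152921504606846976
```

Even under these crude assumptions, 20 on/off genes give over a million patterns in a single context, and over a quintillion once three times or places are allowed to differ.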
A textbook example is the Hox genes, the super-controllers of development: a set of genes that tell bodies where to develop heads, tails, arms and legs in every major animal group. Hox genes are found in mice, worms, humans… they are inherited from a common ancestor. Other examples of toolkit genes are those that control eye development, or hair/plumage colouration. Toolkit genes are old, present in all animals, and they do pretty much the same thing in all animals. There is no denying that conserved genomic material forms an important part of the molecular building blocks of life.
However. We can now sequence de novo the genomes and transcriptomes (the genes expressed at any one time/place) of any organism. We have sequence data for algae, pythons, green sea turtles, puffer fish, pied flycatchers, platypuses, koalas, bonobos, giant pandas, bottlenose dolphins, leafcutter ants, monarch butterflies, Pacific oysters, leeches… the list is growing exponentially. And each new genome brings with it a suite of unique genes. Twenty percent of genes in nematodes are unique. Each lineage of ants contains about 4000 novel genes, but only 64 of these are conserved across all seven ant genomes sequenced so far.
Many of these unique ('novel') genes are proving important in the evolution of biological innovations. Morphological differences between closely related freshwater polyps, Hydra, can be attributed to a small group of novel genes. Novel genes are emerging as important in the worker castes of bees, wasps and ants. Newt-specific genes may play a role in their amazing tissue regenerative powers. In humans, novel genes are associated with devastating diseases, such as leukaemia and Alzheimer's.
Life is genomically complex, and this complexity plays a crucial role in the evolution of life's diversity. It's easy to see how an innovation can be improved through natural selection, e.g. once the first eye evolved, it was subject to strong selection to increase the fitness (survival and reproduction) of its owner. It is more challenging to explain how novelty first originates, especially from a conserved genomic toolkit. Darwinian evolution explains how organisms and their traits evolve, but not how they originate. How did the first eye arise? Or more specifically, how did that master regulatory gene for eye development in all animals first originate? The capacity to evolve novel phenotypic traits (be they morphological, physiological or behavioural) is crucial for survival and adaptation, especially in changing (or new) environments.
A conserved genome can generate novelties through rearrangements (within or between genes), changes in regulation or genome duplication events. For example, the vertebrate genome has been duplicated in its entirety twice in its evolutionary history; salmonid fish have undergone a further two whole-genome duplications. Duplications reduce selection on the function of one of the gene copies, allowing that copy to mutate and evolve into a new gene whilst the other copy maintains business as usual. Conserved genomes can also harbour a lot of latent genetic variation (fodder for evolving novelty) that is not exposed to selection. Non-lethal variation can lie dormant in the genome by not being expressed, or by being expressed at times when it doesn't have a lethal effect on the phenotype. The molecular machinery that regulates expression of genes and proteins depends on minimal information, rules and tools: transcription factors recognise sequences of only a few base-pairs as binding sites, which gives them enormous potential for plasticity in where they bind. Pleiotropic changes across many conserved genes, using different combinations of transcription, translation and/or post-translation activity, are a good source of genomic novelty. For example, the evolution of beak shapes in Darwin's finches is controlled by pleiotropic changes brought about by changes in the signalling patterns of a conserved gene that controls bone development. The combinatorial power of even a limited genetic toolkit gives it enormous potential to evolve novelty from old machinery.
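The point about short binding sites can be made concrete with some rough arithmetic. The sketch below assumes, purely for illustration, a genome of random and equally likely bases (real genomes are not like this) and a round figure of 3 billion base-pairs, roughly the size of the human genome; the function name and the motif lengths are likewise chosen just for the example.

```python
def expected_chance_sites(genome_length: int, motif_length: int,
                          both_strands: bool = False) -> float:
    """Expected number of exact matches to one specific motif in a
    genome of independent, equally likely bases (a crude assumption)."""
    positions = genome_length - motif_length + 1  # possible start positions
    per_position = 0.25 ** motif_length           # chance of a match at each
    expected = positions * per_position
    return 2 * expected if both_strands else expected


GENOME_LENGTH = 3_000_000_000  # assumed round figure, about human-sized

for k in (6, 8, 10, 12):
    sites = expected_chance_sites(GENOME_LENGTH, k)
    print(f"{k} bp motif: about {sites:,.0f} chance matches")
```

Under these assumptions a six-base-pair motif turns up by chance hundreds of thousands of times, so a handful of mutations can easily create or destroy a potential binding site somewhere new.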
But the presence of unique genes in all evolutionary lineages studied to date now tells us that de novo gene birth, rather than a reordering of old ingredients, is important in phenotypic evolution. The over-abundance of non-coding DNA in genomes is less puzzling if it is a melting pot for genomes to exploit and create new genes and gene function, and ultimately phenotypic innovation. The current thinking is that genomes are constantly producing new genes, but that only a few become functional.
Our story started simply: all life is a product of gentle evolutionary tinkering of a shared molecular toolkit. The once-unimaginable time has arrived when we can unpack the molecular building blocks of any creature. And these data are shaking things up. A surprise? Not really. Perhaps the most important lesson from this is that no theory is completely right, and that good theories are those that continue evolving and embracing innovation. Let's evolve theories (keeping the bits that are proven correct), not retire them.