Fossils are cool, but some of us are interested in processes and structures that don’t fossilize well. For instance, if you want to know more about the evolution of mammalian reproduction, you’d best not pin your hopes on the discovery of a series of fossilized placentas, or fossilized mammary glands … and although a few fossilized invertebrate embryos have been discovered, their preservation relied on conditions not found inside the rotting gut cavity of dead pregnant mammals.
You’d think this would mean we’re right out of luck, but as it turns out, we have a place to turn to, a different kind of fossil. These are fossil genes, relics of our ancient past, and they are found by digging in the debris of our genomes. By comparing the sequences of genes of known function in different lineages, we can get a measure of divergence times … and in the case of some genes which have discrete functions, we can even plot the times of origin or loss of those particular functions in the organism’s history.
Here’s one example. We don’t have any fossilized placentas, but we know that there was an important transition in the mammalian lineage: we had to have shifted from producing eggs in which yolk was the primary source of embryonic nutrition to a state where the embryo acquired its nutrition from a direct interface with maternal circulation, the placenta. We modern mammals don’t need yolk at all … but could there be vestiges of yolk proteins still left buried in our genome? The answer, which you already know since I’m writing this, is yes.
First, a little background. It’s not that surprising to find traces of yolk proteins in our genomes, because we also have the evidence of embryology that shows that our embryos still make a yolk sac! Below is a series of diagrams of the human embryo over the last several weeks of the first month of pregnancy, and you can see the large sac hanging from the embryo; it’s a useless fluid filled space that contains no yolk at all, but is homologous to similar structures that form in birds and reptiles.
Sometimes people refuse to believe that we could have a yolk sac, and they don’t trust cartoons, so here’s a photo of a 28-somite stage embryo. The side view on the left nicely shows the branchial arches (they also don’t want to believe in those), but the one on the right is the same embryo rotated, so you can see the huge empty balloon of the yolk sac.
We retain the sac, but what about the contents? Where are the yolk proteins? The primary component of yolk is made from a protein called vitellogenin. Vitellogenin is a large (250-600 kD) glycophospholipoprotein, which basically means that it has a protein core that is extensively modified by the addition of sugars, phosphates, and fatty acids: it’s a great greasy lump of protein, fat, and sugar, just the thing growing embryos need to eat. Animals with yolky eggs synthesize vitellogenin in their livers and transport it to the oviducts, the site of egg production, where it is deposited in the yolk sac and further broken down into the two major yolk proteins, phosvitin and lipovitellin. Mammals don’t make vitellogenin at all, although there are some interesting similarities between portions of vitellogenins and the lipoproteins that we use to transport fats in our circulatory systems (the atherogenic lipoproteins that are the curse of our modern diets may be related to the lipoproteins our ancestors used to feed their embryos).
We can follow the evolutionary history of the vitellogenin gene. Tetrapod ancestors, 350-400 million years ago, had two copies of the gene, called VIT1 and VITanc (having multiple copies of a gene whose product is in high demand, like yolk proteins, is advantageous for boosting output). Some time before the mammalian lineage diverged from the reptile/bird lineage, there was a duplication of VITanc to form VIT2 and VIT3 … so chickens have 3 vitellogenin genes, VIT1, VIT2, and VIT3.
How do we know that this duplication occurred before the mammalian line split off? Because we also have VIT1, VIT2, and VIT3 in our genomes! They are irreparably broken and non-functional, and eroded by time, but Brawand et al. found them, and identified them by sequence similarity and by synteny, or the identity of the adjacent genes.
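To make the synteny idea concrete, here is a toy sketch. It is not how Brawand et al. actually did it; the gene labels (`GENE_A` and friends), the function names, and the `min_shared` threshold are all made up for illustration. The point is only that a battered pseudogene can be recognized by the company it keeps: if the same neighbors flank the locus in two genomes, the two loci are probably the same place.

```python
def shared_neighbors(flank_a, flank_b):
    """Return the gene names that flank the locus in both genomes."""
    return set(flank_a) & set(flank_b)


def looks_syntenic(flank_a, flank_b, min_shared=2):
    """Crude test: do the two loci share at least `min_shared` neighbors?

    The threshold is an arbitrary assumption for this sketch.
    """
    return len(shared_neighbors(flank_a, flank_b)) >= min_shared


# Hypothetical placeholder labels, not real gene names.
chicken_flank = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
human_flank = ["GENE_A", "GENE_B", "GENE_X", "GENE_D"]
unrelated_flank = ["GENE_P", "GENE_Q", "GENE_R", "GENE_S"]

print(looks_syntenic(chicken_flank, human_flank))
print(looks_syntenic(chicken_flank, unrelated_flank))
```

A real analysis would also care about gene order and orientation, and would lean on sequence similarity as well, but the neighborhood logic is the same.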
When non-functional genes, called pseudogenes, like this are found, one thing one can do is estimate the time of loss of function from the amount of decay. Natural selection is a force that maintains genes, and in its absence, they tend to slowly fall apart as they accumulate mutations. Browsing through the genome is like strolling through a run-down neighborhood. Houses that are still occupied will be maintained and kept up. Houses that have been recently abandoned might have an overgrown lawn and broken windows. Houses that have been neglected longer still might show signs of fire damage, or structural collapse, or might have been demolished right down to their foundations. By measuring the divergence of mammalian pseudogenes for vitellogenin from bird vitellogenin genes, for instance, we can estimate the time of loss.
Rather than counting broken windows, in genes we count the accumulation of stop codons (sequences that signal the end of translation) and indels. An indel is a single insertion or deletion of a stretch of nucleotides in a gene, and in the lineages studied here they occur at a rate of slightly more than 1 x 10^-10 per site per year, so it’s like a very slowly ticking clock that gradually scrambles the pseudogene.
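If you want to see how that clock turns into a date, here is a back-of-the-envelope sketch. It assumes indels pile up at a constant rate once selection stops caring, and it uses the rounded rate quoted above; the 5,000-site sequence length is a made-up number for illustration, and the function names are mine. The published analysis is more careful than this and comes with wide error bars.

```python
# Rounded rate from the text ("slightly more than" this value).
INDEL_RATE_PER_SITE_PER_YEAR = 1e-10


def expected_indels(length_sites, years, rate=INDEL_RATE_PER_SITE_PER_YEAR):
    """Expected number of indels in an unconstrained sequence."""
    return rate * length_sites * years


def years_since_loss(indel_count, length_sites, rate=INDEL_RATE_PER_SITE_PER_YEAR):
    """Invert the clock: how long would it take to accumulate this many indels?"""
    if length_sites <= 0 or rate <= 0:
        raise ValueError("length_sites and rate must be positive")
    return indel_count / (rate * length_sites)


# Hypothetical example: a 5,000-site pseudogene abandoned 150 million years ago.
n = expected_indels(5_000, 150e6)
print(f"expected indels: {n:.0f}")
print(f"implied age: {years_since_loss(n, 5_000) / 1e6:.0f} million years")
```

Notice how slow the clock is: even after 150 million years, a sequence that size is expected to have collected only a few dozen indels, which is part of why the date estimates come with such wide ranges.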
The results are summarized in this diagram.
The loss of vitellogenin was not abrupt. VIT1 and VIT3 became nonfunctional about 150 million years ago (note, though, the wide range of possible times, caused by uncertainty in the methods), roughly corresponding to the emergence of the eutherian ancestors and coming after the origin of viviparity. VIT2 hung in there until about 70 million years ago, suggesting that maybe those Cretaceous mammals were still pumping a little yolk protein into those yolk sacs, as a supplemental nutrition source.
The monotremes have also lost most of their vitellogenin genes, but still retain one, to no one’s surprise — they still lay eggs. Furthermore, VIT1 was only relatively recently lost, about 50 million years ago.
One other detail in the chart is of interest. It shows that nutritive lactation arose before placentation and loss of the vitellogenin genes. Again, no one has found fossil mammary glands; instead, they looked at genes important in milk production, in particular, the casein milk genes. Casein is a secreted calcium-binding phosphoprotein that is essential for transporting calcium to the embryo, and calcium is a critical growth-limiting mineral during embryogenesis. We have caseins, of course, and the platypus is found to have orthologous casein genes, which tells us that these genes arose before the monotreme and eutherian split.
Taken together, these data tell a story. Lactation evolved first, representing a gradual shift in parental investment from storage of yolk in eggs to later, post-hatching care. This reduced selective constraints on yolk production — three genes were overkill for the level of output needed — and was permissive in allowing the gradual decay of the VIT genes. Viviparity and placentation then made the yolk proteins more and more superfluous, as embryos became more and more reliant on simply tapping directly into the maternal blood supply. The process represents a pattern of change away from stockpiling massive quantities of nutritional supplies for future growth, to a more efficient just-in-time delivery system.
The story is all right there in your genes. You’re walking around carrying the crumbling record of hundreds of millions of years of history; all we need are the tools to extract it and read it.
Brawand D, Wahli W, Kaessmann H (2008) Loss of egg yolk genes in mammals and the origin of lactation and placentation. PLoS Biol 6(3):e63.
Sadler TW (2004) Langman’s Medical Embryology, 9th ed. Lippincott Williams and Wilkins, Baltimore.