All of us mammals have pretty much the same set of genes, yet obviously there have to be some significant differences to differentiate a man from a mouse. We currently think that a major source of morphological diversity lies in the cis-regulatory regions; that is, stretches of DNA outside the actual coding region of the gene that are responsible for switching the gene on and off. We might all have hair, but where we differ is when and where mice and men grow it on their bodies, and that is under the control of these regulatory elements.
A new paper by Fondon and Garner suggests that there is another source of variation between individuals: tandem repeats. Tandem repeats are short lengths of DNA that are repeated multiple times within a gene, anywhere from a handful of copies to more than a hundred. They are also called VNTRs, or variable number tandem repeats, because different individuals within a population may have different numbers of repeats. These VNTRs are relatively easy to detect with molecular tools, and we know that populations (humans included) may carry a large reservoir of different numbers of repeats, but what exactly the differences do has never been clear. One person might carry 3 tandem repeats in a particular gene, while her neighbor might bear 15, with no obvious differences between them that can be traced to that particular gene. So the question is what, if anything, does having a different number of tandem repeats do to an organism?
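To make "number of repeats" concrete, here is a minimal Python sketch that counts the longest run of back-to-back copies of a repeat unit in a sequence. The function name, the CAG unit, and the flanking bases are all made up for illustration; only the 3-versus-15 contrast comes from the example above, and real genotyping is done with lab assays, not string matching.

```python
def max_tandem_copies(sequence: str, unit: str) -> int:
    """Return the largest number of consecutive copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be a non-empty string")
    best = 0
    # Try every start position; brute force, but simple and correct.
    for start in range(len(sequence)):
        copies = 0
        pos = start
        while sequence.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
    return best


# Hypothetical alleles: same flanking sequence, different repeat counts.
her_allele = "ATGG" + "CAG" * 3 + "TTCA"
neighbor_allele = "ATGG" + "CAG" * 15 + "TTCA"

print(max_tandem_copies(her_allele, "CAG"))
print(max_tandem_copies(neighbor_allele, "CAG"))
```

The two alleles differ only in how many copies of the unit they carry, which is exactly the kind of variation a VNTR describes.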
Fondon and Garner address this question by first looking for populations that exhibit large and obvious morphological differences between individuals, and then looking within their genome to see if those differences can be correlated with the number of tandem repeats present. The population they are investigating is the domestic dog. Dogs are not only diverse, but dog breeders are notoriously picky about shape and character, and purebred dogs have been under intense selection for specific attributes. Once a range of morphologies in a particular character, such as the shape of the snout, has been identified, one can ask whether that trait is reflected in the number of repeats in any genes.
The authors examined 142 dogs from 92 different breeds, and looked at 37 different tandem repeats in 17 genes in each. The genes selected were developmentally significant transcription factors that were at least suspected of playing a role in the formation of specific morphologies. Fifteen of the 17 genes turned out to have multiple alleles varying in the length of their repeats.
That there would be this substantial amount of genetic variation in tandem repeat number isn’t at all surprising. Tandem repeats are subject to very high mutation rates, up to 100,000 times the rate of point mutations, because they are prone to a kind of error called slipped-strand mispairing. Because they contain many copies of the same short sequence over and over, it is easy for the two strands of DNA to get misaligned in this local region: the GTAC on one strand could base-pair with the first CATG in the other strand, or the second, or the third. If the strands are mispaired, the replicating enzymes can err and either clip off some of the repeats, or add extra repeats. It’s a special kind of error, in that the DNA changes aren’t to random nucleotides, but instead produce only different numbers of repeats.
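Here is a toy Python simulation of that idea, offered as a sketch and not as a model from the paper. It tracks only the repeat count of one locus down a lineage, and at each round of copying it either leaves the count alone or, with some probability, adds or removes a single repeat. The function names, the slip probability, and the one-unit-per-slip rule are all my assumptions for illustration.

```python
import random


def replicate(repeat_count: int, slip_prob: float, rng: random.Random) -> int:
    """Copy a repeat tract once; a slip gains or loses exactly one unit (assumed)."""
    if rng.random() < slip_prob:
        repeat_count += rng.choice((-1, 1))
    # Assumption: keep at least one unit so the locus never vanishes.
    return max(1, repeat_count)


def simulate_lineage(start_count: int, generations: int,
                     slip_prob: float, seed: int = 0) -> list[int]:
    """Return the repeat count at each generation, starting with start_count."""
    rng = random.Random(seed)
    counts = [start_count]
    for _ in range(generations):
        counts.append(replicate(counts[-1], slip_prob, rng))
    return counts


# Hypothetical parameters: 10 repeats to start, 1000 rounds of copying.
history = simulate_lineage(start_count=10, generations=1000, slip_prob=0.05)
print(min(history), max(history))
```

Every change in the output is a whole number of repeats gained or lost; nothing in this kind of error scrambles the sequence of the repeat unit itself.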
Note that this lack of fidelity in copying tandem repeats means that they are only going to be found in regions of genes that can tolerate some variability in the length of the resulting protein. That’s interesting in itself, since it says these proteins are capable of functioning with ±30 or more amino acids in their final length.
Also, slipped-strand mispairing can be foiled by point mutations, even to synonymous codons, within the tandem repeat. A small change in the sequence gives the replication machinery a local difference that is used to properly align the two strands, and a stable tandem repeat will accumulate these small changes and lose its repeated character. On the other hand, a deletion caused by slipped-strand mispairing can remove the point difference, and subsequent mispairing can then expand the sequence, producing a repeat free of imperfections. One measure of how much selection for variation has been going on within a tandem repeat is its purity: if there are few interruptions in the perfection of the repeat, there has been much deletion and expansion going on within the sequence in its history. If there are multiple deviations from perfect repetition, then the sequence has not undergone much length variation in the recent past.
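One simple way to put a number on purity is to chop the repeat tract into units and ask what fraction match the most common unit. The Python sketch below does just that. I don't know the exact purity metric the authors used, so treat this definition, the function name, and the example sequences as assumptions for illustration.

```python
from collections import Counter


def repeat_purity(tract: str, unit_len: int) -> float:
    """Fraction of units in `tract` that match the most common unit."""
    if unit_len <= 0 or not tract or len(tract) % unit_len != 0:
        raise ValueError("tract must be a non-empty whole number of units")
    units = [tract[i:i + unit_len] for i in range(0, len(tract), unit_len)]
    # The most common unit stands in for the consensus repeat.
    _consensus, matches = Counter(units).most_common(1)[0]
    return matches / len(units)


# A perfect run of CAG codons versus one interrupted by CAA.
# Both CAG and CAA code for glutamine, so the protein is unchanged.
print(repeat_purity("CAG" * 10, 3))          # 1.0
print(repeat_purity("CAGCAGCAACAGCAG", 3))   # 0.8
```

The second tract still codes for five glutamines, but the single synonymous CAA drops its purity, and by the argument above that interruption would make the tract less prone to slippage.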
The purity of the sequence is therefore a measure of how much selection for new variants has been going on in the lineage. The authors compared the same repeat loci in humans and dogs, and found that dog repeats were purer in 29 of 36 cases, and of the same purity in 7 cases. This strongly suggests that the variations in dogs aren’t just random, neutral changes, but are the outcome of recent selection at these loci.
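For a rough sense of how lopsided 29 to 0 (with 7 ties) is, here is a back-of-the-envelope sign test in Python. This is my own arithmetic, not an analysis from the paper, and it assumes the loci are independent and that, if nothing special were going on, either species would be equally likely to have the purer repeat. Ties are dropped, as is conventional for a sign test.

```python
from math import comb


def sign_test_two_sided(wins: int, losses: int) -> float:
    """Two-sided sign test p-value under a 50/50 null, ignoring ties."""
    n = wins + losses
    k = max(wins, losses)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# 29 loci purer in dogs, 0 purer in humans, 7 ties dropped.
p = sign_test_two_sided(wins=29, losses=0)
print(f"{p:.2e}")
```

Under those assumptions the chance of such a one-sided result is on the order of a few in a billion, which is consistent with the claim that the pattern is not just random, neutral change.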
OK, already, so there are these interesting kinds of gene variants in dogs, and they have apparently undergone selection. What effect do the repeats have?
I’ll describe the two main examples from the paper. The first is a gene called Runx-2 (runt-related transcription factor 2), which is related to the Drosophila pair-rule gene (a gene that is involved in segmentation), runt. In vertebrates, one of the functions of Runx-2 is to regulate the differentiation of osteoblasts, the cells responsible for laying down bone. Runx-2 contains two repeats, one coding for 18-20 glutamines (the poly-Q region), and another coding for 12-17 alanines (the poly-A region). A statistical comparison of the total repeat length (Q+A) with various parameters of the skull size revealed a correlation with the midface length, and a property called clinorhynchy, or dorsoventral nose bend. What’s clinorhynchy? If you’ve seen a bull terrier, you know what’s distinctive about them: that long nose with a downward droop.
Bull terriers tend to have a short pair of tandem repeats, and they have long midfaces and pronounced downturn of the snout. They have been intentionally selected for this, and museum specimens over the last 70 years show increased prominence of this feature.
This is cool stuff so far, but I have to tell you, it gets a little more complicated. It’s not as simple as short repeat length→downturned snout. One of the ways transcription factor activity is regulated is through the factors binding to one another, and chains of amino acids can affect how they interact. It turns out that polyglutamine can increase the rate of transcription, while polyalanine reduces it, and the Runx-2 protein has both a polyglutamine (poly-Q) and polyalanine (poly-A) chain. What might matter more in a situation where two competing components modulate activity is the ratio of poly-Q to poly-A, and lo, the poly-Q/poly-A ratio shows an even stronger correlation with clinorhynchy than does poly-Q+poly-A.
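A little arithmetic shows why the ratio can carry information the sum does not. The Python sketch below enumerates every combination of poly-Q and poly-A lengths within the ranges quoted earlier (18-20 glutamines, 12-17 alanines) and groups them by total length. These are hypothetical combinations within those ranges, not observed alleles from the paper, and the names are mine.

```python
from fractions import Fraction
from itertools import product

Q_RANGE = range(18, 21)  # 18-20 glutamines, per the ranges quoted above
A_RANGE = range(12, 18)  # 12-17 alanines, per the ranges quoted above


def combos_by_total() -> dict[int, list[tuple[int, int, Fraction]]]:
    """Group hypothetical (Q, A, Q/A) combinations by total length Q + A."""
    by_total: dict[int, list[tuple[int, int, Fraction]]] = {}
    for q, a in product(Q_RANGE, A_RANGE):
        by_total.setdefault(q + a, []).append((q, a, Fraction(q, a)))
    return by_total


# Same total length of 32, different Q/A ratios.
for q, a, ratio in combos_by_total()[32]:
    print(q, a, float(ratio))
```

Three different combinations share a total of 32 residues while their Q/A ratios run from about 1.29 to about 1.67, so two dogs with the same Q+A could still differ in the balance between the activating and repressing stretches.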
The second gene example is Alx-4 (aristaless-like homeobox 4). Alx-4 is also related to a transcription factor found in Drosophila, and knocking out the gene in mice produces six-toed mice. One specific allele of this gene, Alx-4Δ51, was found in only one breed of dog, the Great Pyrenees. One peculiarity of this breed is hindlimb polydactyly: purebreds are supposed to have a double dewclaw, for a total of six digits on the hindleg. The Alx-4Δ51 allele is a deletion, which knocks out 51 nucleotides from the tandem repeat, for a loss of 17 amino acids. All of the Great Pyrenees with polydactyly have this 17aa deletion; one Great Pyrenees without polydactyly had the full length tandem repeat.
The good news about all of this is that it represents a demonstration of another mode of relatively rapid addition of morphological diversity to a population, and that we have another mechanism for fine-tuning evolution. These tandem repeats are common in the vertebrate genome, so this could clearly be a reservoir of variation and a robust and flexible way to add new variations to a population.
There are some limitations to this study, though. First, it’s focused on an extreme case: purebred dogs that have been experiencing very strong selection for specific and, in some cases, outright deleterious characters. We simply don’t know how important this mode of evolutionary change is under less artificial conditions. Secondly, so far we’re just seeing correlations, not experimental perturbations. They’re darned convincing correlations, but at some point down the road it would be good to see direct manipulation of the Q/A ratio of the Runx-2 gene in a collie, for instance, to give it the downturned nose of a bull terrier. And finally, it may just be me, but I’d like to see developmental studies of the patterns of Runx-2 and Alx-4 gene expression in dog embryos to see exactly how these variations play out.
Still, it’s got me wondering. I’ve got this knobby nose that I can see to varying degrees in my father and paternal grandmother. I wonder if it can be traced to differences in tandem repeat length in some transcription factor?
Fondon JW, Garner HR (2004) Molecular origins of rapid and continuous morphological evolution. PNAS 101(52):18058-18063.