I hate it. Mainly because I get swamped with people asking me to explain crap, and even more, because there’s a whole lot of people who enthusiastically embrace the crap. The crap in question is this press release from the University of Leiden, Second layer of information in DNA confirmed.
The press release is bullshit, OK? But for some reason, people really want to hear that there is some other magical kind of external information that, I don’t know, frees them from the tyranny of genetics, or something. See also epigenetics, which also appeals to the lay public for all the wrong reasons. The paper isn’t talking about a “second layer of information” — it’s talking about mechanical effects of the nucleotide sequence. That’s it. Everyone can calm down now.
The paper itself isn’t bad, it’s just totally unsurprising and takes a low-level physical approach to a phenomenon that every molecular biologist already knows about. So let me strip away the hype and give a simple explanation of what it says.
One of the ways genes get regulated is by the presence of proteins that bind to the DNA strand; that binding can hinder or promote gene expression. Basically proteins package up regions of DNA, and they can wrap it up tightly or loosely.
Now DNA is just a chemical, with local chemical properties that are determined by the nucleotides in the strand. DNA that is rich in C and G, for instance, has subtly different chemical properties than DNA that is rich in A and T. Those properties can influence the binding of those packaging proteins, and they can also affect how the DNA strand folds.
The DNA code is degenerate: that is, there are multiple triplets of nucleotides (called codons) that can specify the same amino acid in the protein a gene produces. That means the gene can have a codon that contains a C or a T at a given position, for instance, and still specify the same amino acid.
The codon used can influence those subtle chemical properties of the DNA strand while still specifying the same amino acid. What that means is that different sequences that code for exactly the same protein may, depending on which codons they use, have different mechanical properties.
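If it helps to see that concretely, here is a minimal Python sketch. It assumes a deliberately partial codon table (just the few entries it needs from the standard genetic code) and uses GC fraction as a crude stand-in for those "subtle chemical properties"; the paper's mechanical model is far more detailed than this. The two example sequences and the function names are made up for illustration.

```python
# Partial codon table from the standard genetic code: only the
# amino acids needed for this toy example (an assumption, not a full table).
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",  # glycine
    "CTT": "L", "CTC": "L", "CTA": "L", "CTG": "L",  # leucine
    "TTA": "L", "TTG": "L",                          # leucine (continued)
    "AAA": "K", "AAG": "K",                          # lysine
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def gc_fraction(dna):
    """Fraction of bases that are G or C: a crude proxy, not a mechanical model."""
    return sum(base in "GC" for base in dna) / len(dna)


# Two hypothetical sequences that differ only in synonymous codon choices.
at_rich = "GCTGGTTTAAAA"  # GCT GGT TTA AAA
gc_rich = "GCCGGCCTGAAG"  # GCC GGC CTG AAG

for name, seq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    print(name, seq, translate(seq), round(gc_fraction(seq), 2))
```

Both sequences translate to the same four amino acids, but their base composition is quite different, and base composition is one of the things that changes how the strand behaves physically.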
That’s it! What the paper did was carry out simulations of variations in the sequence of a model strand of DNA, keeping the translation of the strand into protein constant, but using different alternative codons. It was an exercise in varying synonyms. For example, imagine a sentence like this:
“My dog has fleas.”
Then we generate a lot of synonymous versions.
“My pet has fleas.”
“My animal has fleas.”
“My dog has parasites.”
“My pet has parasites.”
And so on.
We then ask whether each of those has exactly the same interpretation, and the answer is, unsurprisingly, no. It’s the same in this simulation, only they’re replacing codons in a 147 base pair sequence with synonyms, and then calculating the mechanical properties of the result and simulating the energetics of that sequence binding with packaging proteins.
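The synonym-swapping exercise can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's method: it enumerates every synonymous encoding of a hypothetical three-amino-acid peptide (the real work used a 147 base pair sequence, far too long to enumerate this way), and it scores each variant by GC fraction as a placeholder for the mechanical and energetic calculations. The `SYNONYMS` table is a small excerpt of the standard genetic code, and all names here are illustrative.

```python
from itertools import product

# Synonymous codons for three amino acids, taken from the standard genetic code.
# Only an excerpt: a real table covers all 20 amino acids.
SYNONYMS = {
    "A": ["GCT", "GCC", "GCA", "GCG"],  # alanine
    "K": ["AAA", "AAG"],                # lysine
    "F": ["TTT", "TTC"],                # phenylalanine
}


def synonymous_sequences(peptide):
    """Return every DNA sequence that encodes the given peptide."""
    choices = [SYNONYMS[aa] for aa in peptide]
    return ["".join(codons) for codons in product(*choices)]


def gc_fraction(dna):
    """Placeholder score: fraction of G/C bases (NOT a mechanical model)."""
    return sum(base in "GC" for base in dna) / len(dna)


# Hypothetical peptide: alanine, lysine, phenylalanine.
variants = synonymous_sequences("AKF")

# Score every variant and sort from lowest to highest GC fraction.
scores = sorted((gc_fraction(v), v) for v in variants)

print(len(variants), "synonymous sequences, all encoding the same peptide")
print("lowest score: ", scores[0])
print("highest score:", scores[-1])
```

Every one of those variants says "the same thing" as far as the protein is concerned, yet the score differs from one to the next. Swap the placeholder score for a proper physical model and you have the general shape of the exercise.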
This is not new. Molecular biologists have known about the effects of codon usage on gene expression for decades. But it’s nice of the physicists to come along and tell us, from a different perspective, what we already knew.
But please, publicists: it is not a “second layer of information”. Everything in this paper was about modifying the nucleotide sequence, which is the same old primary heritable layer of information we’ve been talking about all along, and it isn’t hidden or mysterious.
But I know they won’t.