Is the genome 100% functional?

Dan Graur answers that question.

And the answer is no. Not even close. Can’t be.

You may be pleased (or not) to learn that the ENCODE project has been awarded over $30 million to keep rationalizing causal function into selected effect function (see the video for an explanation).


  1. Nathaniel Tagg says

    Nice. So simple even a physicist can understand it!

    I suppose the creationist rebuttal would have to be that God keeps deleterious mutations from happening in His children? Those mutation rates are measured entirely in samples that God has forsaken?

  2. Reginald Selkirk says

    7 billion bites (sic)

    Do not do your technical learning from the technically illiterate.

  3. says

    Isn’t this just a terminology dispute? Graur says it’s functional if you’re more or less fit from it, ENCODE says it’s functional if you can raise biomedical grant money from it :)

  4. =8)-DX says

    Is the selection pressure Graur’s wife exerts to change junk into rubbish a sharp clip over the ear? =P Lovely lecture, just makes me think “Old Codger”.

  5. prae says

    I wanted to ask a question, but the video answered it. So basically, you need junk DNA in order to “catch” mutations, so that the probability of them breaking a functional gene stays low? Does this mean, we can’t clean up our DNA? As a software developer, that somewhat bothers me. Couldn’t we at least overwrite it all with some unused codon?

  6. kevinkirkpatrick says

    Since mutation rate is computed per-site (not mutations-per-genome-copy), I’m not entirely clear on how the existence of junk DNA reduces the requisite offspring for population survival. At least, when I try to simplify this conceptually, I wind up at a place where the existence of (and amount of) junk DNA has no influence on “how many offspring are necessary to produce a viable next-generation?” Am I oversimplifying one of the assumptions of the mathematical model?

    Here’s my simplified 3-gene organism: [C, S, R]
    and 4-gene variant (with junk DNA): [C, S, R, J]
    C = how-to-copy gene, S = how-to-survive gene, R = gene reader, J=junk

    Say the mutation rate is .5 per gene per reproduction.

    For the 3-gene (100% functional) organism, on average, 1 in 8 offspring will survive and reproduce: 1/2 will have good C, 1/2 will have good S, and 1/2 will have good R.

    But it doesn’t seem like this equation will change for the 4-gene (junk-DNA) variant. Yes, instead of 8 possible outcomes, there will be 16 (each of the original 8 coupled with a meaningless yes/no mutation of the J gene). But of those 16 outcomes, only 2 will be viable… which is the same percentage.

    What basic element of the mathematical model discussed in this video, relating “% of junk” DNA to “offspring needed to avoid extinction”, am I missing?
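The toy model in this comment can be checked by brute-force enumeration. The sketch below is only an illustration: the function name `viable_fraction` is made up here, and it assumes (as the comment does) that each gene mutates independently with probability 0.5 and that any mutation in C, S, or R is fatal.

```python
from itertools import product

def viable_fraction(genes, critical, mutation_rate=0.5):
    """Probability that an offspring has no mutation in any critical gene,
    found by enumerating every mutated/intact combination of genes."""
    total = 0.0
    for outcome in product([False, True], repeat=len(genes)):
        # Probability of this exact pattern (True = mutated, False = intact).
        p = 1.0
        for mutated in outcome:
            p *= mutation_rate if mutated else 1.0 - mutation_rate
        # The offspring is viable only if every critical gene is intact.
        if not any(m for g, m in zip(genes, outcome) if g in critical):
            total += p
    return total

three_gene = viable_fraction(["C", "S", "R"], critical={"C", "S", "R"})
four_gene = viable_fraction(["C", "S", "R", "J"], critical={"C", "S", "R"})
print(three_gene, four_gene)
```

Under these assumptions both organisms give the same viable fraction, 1 in 8, which is the observation the comment starts from: adding a junk gene to a fixed set of critical genes changes nothing.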

  7. CreativeEntropy says

    Prae and kevin,

    I think you’ve just slightly jumbled the interpretation. The number of offspring calculations he makes assume a fixed genome size (human sized) and varying degrees of junk (i.e. the sizes of the functional region and the junk region vary but sum to the same total size). For a genome of that size, a given number of total mutations are expected to occur across the whole genome in an unbiased fashion. However, these mutations will only be deleterious if they alter functional DNA. Given more functional DNA, we expect more deleterious mutations. Therefore, given constant genome size and mutation rate, we’d expect a consistent number of mutations, but the number of deleterious mutations increases with an increase in the ratio of functional to junk DNA. Given more deleterious mutations, more offspring must be produced to maintain a stable population size.

    I.e. only a certain percentage can reasonably be functional given a certain genome size. Otherwise maintaining the population would require untenable levels of fecundity.

  8. prae says

    @CreativeEntropy: yes, that’s what I said. Or, tried to say. You need a certain amount of junk DNA (depending on the amount of your functional DNA), so that mutations, which occur at a somewhat fixed rate and independent of the size of your genome, are more likely to “hit” junk DNA than something functional.

    The question is: what does cause such mutations? It must be something which actively targets DNA. Radiation or other sources of random molecular damage should be less likely to damage DNA if it gets smaller. Viruses? These parasitic self-replicating DNA sequences? Some DNA-breaking enzymes?

  9. CreativeEntropy says

    The thing is, it’s not that you need junk (under the assumptions of the model). If you had a big genome that’s 90% junk or a 10x smaller genome that’s all functional, the number of deleterious mutations in the germ line should be equivalent (under the reasonable assumption that these mutations result primarily from errors of DNA replication). The number of total mutations resulting from replication errors increases with genome size, but the number of these that are deleterious depends on the size of the functional subset. Adding junk increases total mutational burden without altering the expected number of deleterious mutations. The number of deleterious mutations in the model is independent of the amount of junk and is solely dependent on the total amount of functional sequence.
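A minimal numerical sketch of this point follows. The per-site rate and the genome sizes are illustrative placeholders, not figures from the video, and the function `expected_mutations` is hypothetical. It also assumes, as a simplification, that every mutation landing in functional sequence is deleterious.

```python
def expected_mutations(per_site_rate, genome_size, functional_size):
    """Expected (total, deleterious) mutations per replication, assuming
    errors fall uniformly across sites and only functional sites matter."""
    total = per_site_rate * genome_size
    deleterious = per_site_rate * functional_size
    return total, deleterious

# Placeholder numbers: a large genome that is 90% junk versus a
# 10x smaller genome that is entirely functional.
big = expected_mutations(1e-8, 3_000_000_000, 300_000_000)
small = expected_mutations(1e-8, 300_000_000, 300_000_000)
print("big genome:  ", big)
print("small genome:", small)
```

The total differs between the two genomes, but the deleterious count is identical, because it depends only on the amount of functional sequence.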

  10. says

    I was thinking about this yesterday, but if your blood type offers no particular survival advantage, would an evo biologist consider the sites that determine blood type functional?

  11. kevinkirkpatrick says


    Thanks, I had to read it 3 times, but it finally clicked. My model wasn’t off, but my question was.

    What I was asking was:
    Given organism [C, S, R, J1, J2, J3, …, Jn], and mutation rate of 0.05, why does increasing n reduce the number of offspring required?
    As you noted, that’s a nonsensical question: it doesn’t. Junk DNA doesn’t actually “protect” critical/germline genes (it does not reduce the chances of those genes being mutated).

    It’s also backwards: I started with an organism where we know how many genes are critical and how many are junk. But this is the “unknown” ratio that the model is attempting to constrain.

    My hypothetical should have been laid out thusly:
    A steady population of organisms has 10 genes. The mutation rate is 0.5 per gene per copy. If each generation produces 4 offspring, how many of the genes (at most) can be critical? The answer is: at most, only 2 of the 10 genes (20 percent) can be critical. If 3 were critical, then (on average) only 0.5 offspring would survive to the next generation.

    Yes, I realize the actual math is far more nuanced than this; but am I at least closer to getting the basic principle of Graur’s case?
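The reversed question in this comment can be worked through directly. This is a sketch of the commenter's toy setup only: the helper names are hypothetical, and "steady population" is taken to mean at least one surviving offspring per parent on average.

```python
def expected_survivors(offspring, critical_genes, mutation_rate=0.5):
    """Average number of offspring with every critical gene intact."""
    return offspring * (1.0 - mutation_rate) ** critical_genes

def max_critical_genes(total_genes, offspring, mutation_rate=0.5):
    """Largest number of critical genes that still leaves, on average,
    at least one viable offspring per parent (assumed replacement level)."""
    best = 0
    for k in range(total_genes + 1):
        if expected_survivors(offspring, k, mutation_rate) >= 1.0:
            best = k
    return best

print(max_critical_genes(total_genes=10, offspring=4))      # 2
print(expected_survivors(offspring=4, critical_genes=3))    # 0.5
```

With 4 offspring and a 0.5 mutation rate per gene, at most 2 of the 10 genes can be critical; a third critical gene drops the average to 0.5 survivors, matching the arithmetic in the comment.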

  12. lemurcatta says

    @ 12

    Yes, genes for blood type are functional: they code for a product that has a physiological function, and changes in these regions are visible to selection. Contrast that with regions like microsatellites in non-coding regions, which probably do nothing functional, such as coding for a protein or RNA. Polymorphisms in these regions are not visible to selection.

  13. CreativeEntropy says

    The basic principle is, I think, fairly intuitive, although I’ve perhaps done a poor job as of yet of stating it succinctly. There are limitations on the fidelity with which DNA is replicated, such that for a given length of DNA in a given species, a predictable number of errors in replication (mutations) will occur. As functionality is generally dependent on the conservation of a specific DNA sequence, mutations in a functional DNA sequence are frequently deleterious to the function of that sequence. If the function of a particular sequence is critical to survival and reproduction, then offspring with a deleterious mutation in that sequence will not contribute to future generations.

    The larger the total length of functional DNA that is critical to survival and reproduction, the more likely any given offspring will bear a mutation in these critical sequences, and therefore, the more offspring one must have to maintain the population (to reliably produce offspring without deleterious mutations).

    Given that we have a good estimate of the rate of germ line mutations in humans, we can estimate the total length of functional sequence in the human genome. The actual size of the human genome is considerably larger than the upper bound for these estimates, so a sizeable portion of our genome must be “junk,” i.e. DNA without a conserved sequence with an important function.

    Consistent with this notion, the cumulative length of the conserved DNA sequences in our genome with a known or predicted function is well within the boundaries of the reasonable.
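The argument summarized in this comment can be put into a rough formula. What follows is a simplified sketch, not necessarily the calculation used in the video. It assumes new mutations per offspring follow a Poisson distribution, that any deleterious mutation removes the offspring from the next generation, and that a population holds steady when each parent leaves one mutation-free offspring on average. The figure of 100 new mutations per offspring and the `deleterious_share` parameter are illustrative placeholders, and `offspring_needed` is a hypothetical helper.

```python
import math

def offspring_needed(mutations_per_offspring, functional_fraction,
                     deleterious_share=1.0):
    """Offspring per parent needed so that, on average, one is free of
    deleterious mutations, under a Poisson model of new mutations."""
    # Expected deleterious mutations per offspring.
    u = mutations_per_offspring * functional_fraction * deleterious_share
    # P(no deleterious mutation) = exp(-u), so 1 / exp(-u) offspring are needed.
    return math.exp(u)

# Placeholder inputs: 100 new mutations per offspring, varying the
# fraction of the genome assumed to be functional.
for f in (0.05, 0.1, 0.25, 0.5, 1.0):
    needed = offspring_needed(100, f)
    print(f"{f:.0%} functional -> {needed:.3g} offspring per parent")
```

Because the requirement grows exponentially with the functional fraction, even modest increases in the amount of functional sequence push the needed fecundity out of any plausible range, which is the sense in which only a limited share of a genome this size can be functional.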