The pandemic with its virus mutations has rekindled my interest in the mathematics of evolution. Way back in 2007, I wrote a 20-part series of blog posts on evolution. This was way ‘out of my lane’, as the kids say these days, since I am a physicist and have zero formal training in biology, having taken my one and only course in that subject when I was in eighth grade. But I did this for the same reason I write about a lot of things, in order to force myself to learn about a topic that interests me and to sort out my ideas.
One of the things that had intrigued me is how a mutation that occurs in just one organism could not only become dominant in the population but end up as the only variant. In some posts in the series, I discussed the mathematics of how this happens, which I have edited and reproduce here.
The way that even a very small selection advantage can result in one variety dominating a species can be appreciated using the more familiar example of compound interest. Suppose a parent gives each of two children $1,000 at the same time. One of the children invests in a bank that offers an annual interest rate of 5.0% while the other, being slightly more thrifty, shops around and invests in a different bank at 5.1%. Although they start out with the total money being split 50-50, in about 7,000 years the second child (or rather that child’s descendants) will have roughly 99.9% of the total money, thanks to that very small advantage in the annual rate of return.
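The compound-interest claim can be checked with a few lines of Python. This is only a sketch: it assumes interest compounds once a year and nothing is ever withdrawn, and the function names are mine, chosen for illustration.

```python
import math

def share_of_total(rate_a, rate_b, years):
    """Fraction of the combined money held by the account earning rate_b."""
    # Both accounts start with the same principal, so it cancels out and
    # only the ratio of the two growth factors matters.
    log_ratio = years * (math.log1p(rate_b) - math.log1p(rate_a))
    ratio = math.exp(log_ratio)
    return ratio / (1.0 + ratio)

def years_to_share(rate_a, rate_b, target_share):
    """Years until the rate_b account holds target_share of the total."""
    odds = target_share / (1.0 - target_share)
    return math.log(odds) / (math.log1p(rate_b) - math.log1p(rate_a))

print(share_of_total(0.050, 0.051, 7000))   # share after 7,000 years
print(years_to_share(0.050, 0.051, 0.999))  # years to reach a 99.9% share
```

Under these assumptions the share after 7,000 years comes out just under 99.9%, and reaching 99.9% exactly takes roughly 7,250 years.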
It is exactly this kind of differential survival rate that plays such an important role in natural selection. Even minute differences in fitness can result, over the long term, in the runaway domination of one variety. To see how fast this can happen, population geneticists have carried out calculations.
Suppose one variety has a mutated gene that has a very slight fitness advantage over the existing gene. ‘Fitness advantage’ can be quantified by defining the fitness w as a measure of the individual’s ability to survive and reproduce. (Fitness combines the organism’s ability to survive, at least until its reproductive age is over, with its fecundity, i.e., the number of offspring it produces.) Suppose the original gene has fitness w=1 and the new mutation has fitness w=1+s, where s is the selection advantage. The selection advantage is a measure of how much more likely it is that that particular variety will propagate itself in future generations when compared with the standard type. So if, on average, a group of the new mutated variety produces 101 fertile adult descendants while the same number of the standard organism produces 100, then s=0.01.
When this selection advantage is included in the calculation, the number of generations T it will take for a mutation to increase its frequency in the population from an initial value of f to a final value of F is given by the formula T = (1/s) ln[F(1-f) / (f(1-F))], where ‘ln’ stands for the natural logarithm. (Molecular Evolution, Wen-Hsiung Li, 1997, p. 39)
So if we start with a trait that is present in just 0.1% of the population (i.e., f=0.001), and if this has a small selection advantage of size s=0.01, this variety will grow to 99.9% (F=0.999) of the population in just under 1,400 generations, which can be a very short time, especially on the geological scale, for organisms that do not have long lifespans.
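The arithmetic is easy to reproduce. The sketch below just evaluates the formula quoted from Li; the function name and the input checks are my own additions.

```python
import math

def generations_to_spread(s, f_start, f_end):
    """Generations for a variant to go from frequency f_start to f_end.

    Evaluates T = (1/s) * ln[ F(1-f) / (f(1-F)) ] with f = f_start, F = f_end.
    """
    if s <= 0:
        raise ValueError("s must be positive for this formula")
    if not (0 < f_start < 1 and 0 < f_end < 1):
        raise ValueError("frequencies must lie strictly between 0 and 1")
    # Written in terms of odds, f/(1-f), the formula is a difference of log-odds.
    odds_start = f_start / (1.0 - f_start)
    odds_end = f_end / (1.0 - f_end)
    return math.log(odds_end / odds_start) / s

print(generations_to_spread(0.01, 0.001, 0.999))  # about 1,381 generations
```

Because T is proportional to 1/s, a tenfold smaller advantage (s=0.001) takes ten times as long, about 13,800 generations, for the same change in frequency.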
In the above, we started with an initial population frequency of f. But a mutation will often start out in just a single organism. How does it work then? Most of the time, even a favorable mutation will disappear by random chance, because (say) the mutated organism died before it produced any offspring, or it did produce a few but that particular gene was not inherited. But on occasion that mutation will spread. How likely is it that such a single mutation will spread to every single organism (i.e., become ‘fixed’ in the population)?
Suppose that you have a population of organisms of size N and they all start out having the same gene at a particular position (called the ‘locus’) on one of the chromosomes that carry the DNA. Now suppose a random mutation occurs in just one organism, the way that it happened in the shift from violet to UV sensitive sight in some birds. The calculation in the previous case was deterministic and treated frequencies as smoothly varying numbers. With a single mutant, chance matters, so a different kind of calculation (based on probabilities and known as ‘stochastic’) has to be done. In this case the expectation value for the number of generations T taken for the single new mutation to become fixed in the population (i.e., to spread to 100% of the organisms) is given by T=(2/s)ln(2N), where ‘ln’ stands for the natural logarithm. (Molecular Evolution, Wen-Hsiung Li, 1997, p. 49)
Even if s is taken as a very small advantage of size 0.01, for a population N of one million, the average time taken for a single mutation to become fixed is only about 2,900 generations. So we see that mutations occurring in a single organism can become universal in a very short period on the geological time scale.
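A short Python sketch makes it easy to see how this fixation time depends on the population size. It only evaluates the formula T=(2/s)ln(2N) quoted above; the function name and the extra population sizes are illustrative choices of mine.

```python
import math

def mean_fixation_time(s, population_size):
    """Mean generations to fixation for a single advantageous mutation.

    Evaluates T = (2/s) * ln(2N), which applies only to a mutation that
    does end up becoming fixed.
    """
    return (2.0 / s) * math.log(2 * population_size)

for n in (10**3, 10**6, 10**9):
    print(n, round(mean_fixation_time(0.01, n)))
```

Since N enters only through a logarithm, going from a population of a thousand to a billion less than triples the time to fixation, from roughly 1,500 to roughly 4,300 generations for s=0.01.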
There are two important points that need to be emphasized.
The first is that even a very small selection advantage is sufficient to have that mutation dominate the species. This means that the advantage may not even be visible in the organism itself, which may look like every other organism in the species. For example, an eye mutation that works better by just a tiny bit may look like every other eye. Thus we should not think that natural selection needs big changes in order to work.
The second point is that even a single mutation with a selection advantage, however small, can spread surprisingly rapidly in the population, become universal, and form the basis for future mutations, as long as it takes hold and does not disappear (which has a probability of about 2s of happening).
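The claim that a new mutation takes hold with probability of about 2s can be explored with a small simulation. What follows is a rough sketch under assumptions of my own, not the calculation in Li's book: each copy of the mutation leaves a Poisson-distributed number of copies with mean 1+s, the population is large enough that copies do not compete with each other, and a lineage counts as having taken hold once it reaches about 10/s copies, after which chance extinction is very unlikely. All names are hypothetical.

```python
import math
import random

def poisson(mean, rng):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def lineage_survives(s, rng, safe_copies):
    """Follow one new mutation; True if it reaches safe_copies before dying out."""
    copies = 1
    while 0 < copies < safe_copies:
        # The sum of `copies` independent Poisson(1+s) draws is itself
        # Poisson with mean copies*(1+s), so one draw per generation suffices.
        copies = poisson(copies * (1.0 + s), rng)
    return copies >= safe_copies

def estimate_survival_probability(s, trials=2000, seed=1):
    """Fraction of simulated single mutations that take hold."""
    rng = random.Random(seed)
    safe_copies = math.ceil(10.0 / s)  # assumed threshold for 'has taken hold'
    survived = sum(lineage_survives(s, rng, safe_copies) for _ in range(trials))
    return survived / trials

print(estimate_survival_probability(0.1))
```

With s=0.1 the estimate should land somewhat below 2s=0.2, since 2s is the leading approximation for small s. Either way, the large majority of these advantageous mutations are lost within a few generations.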
It is interesting that even if there is no survival advantage to the new gene (i.e., s=0 and the mutation is said to be ‘neutral’), the mutation can on occasion still spread and become fixed, except that now the average time taken is much longer and is given by T=4N generations. So for a population of one million, it would take on average about 4 million generations for a neutral mutation to spread everywhere, as compared to just 2,900 generations for a selection advantage of 0.01.
The unit of time that is relevant in evolution is the generation. For large organisms, this could be measured in years. But in the case of viruses, the reproductive cycle is measured in just hours, so changes in the population can happen very quickly on the human time scale.

Spherical cow. You noticed that people don’t live that long, but you overlook other assumptions you made: that neither child touches the principal, and taxes don’t exist, and that no political/financial system is likely to last that long.
Factors you overlook in the present example:
You fail to differentiate between haploid and diploid organisms, mixing examples of each together (e.g. viruses, birds)
Sex exists.
A diploid mutation might be dominant or recessive. This will affect your maths.
The mutation might be linked by chromosomal proximity to another locus that has an even stronger selection coefficient, positive or negative.
That’s part of my argument, completely tongue in cheek, that beer is healthier for you than soda pop.
There has just, barely, been enough time for the generations of humanity to adapt to beer as a food source since it was invented. That’s not even close to true for soda pop. Who knows what accumulated poison may be in that stuff.
IIRC, Darwin himself worried that a minor-though-advantageous change would most often just disappear in the statistical noise of a population of any size, thus weakening the effects of natural selection to near-meaninglessness.
Having no concept of genetics as such (he even possessed the issue of the journal in which Mendel published results of his pea garden experiments, but -- knowing only elementary German -- had never opened it), Darwin postulated inheritance as a mixture of fluids (“gemmules”) and thought innovations would disappear by dilution. (Such concerns kept him from publishing his breakthrough for decades: a luxury few scientists could indulge today.)
According to (I think) his autobiography, he eventually saw the rut of a wagon wheel in a road, realized that a temporary period of isolation for a small sub-population would allow the innovation to get a foothold and a fighting chance, heaved a sigh of relief and went back to fretting about other details. Still, the historic reality of geographic sequestration (e.g., the blond hair and blue eyes of those ancient northwestern European freaks) may have mattered quite a bit in actual evolution, and surely provides hours of entertainment for the whole mathematical family in modeling population biology.