(Please see here for previous posts in this series.)
In order to understand how inheritance works and the mathematics involved, it may be helpful to have a quick summary of some basic facts about genetics (a little simplified), using the human genome for concreteness.
All the genetic information in our bodies is found in the DNA, whose famous double helix structure was discovered in 1953. Thanks to the Human Genome Project, we now have a complete map of the DNA of humans, called the human genome, and know that it consists of a sequence of 3.1647 billion sites arranged in a row, each site containing one of four complex molecules (called bases) labeled A, C, T and G. It is this long arrangement of the four bases that defines each of us genetically. Almost 99.9% of the arrangement of these bases is identical in all humans, and about 98% is identical between chimpanzees and us.
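To get a feel for what those percentages mean in absolute terms, here is a rough back-of-the-envelope sketch. It simply applies the quoted percentages to the quoted site count; the variable names are my own, and treating the chimpanzee comparison as a fraction of the human site count is a simplifying assumption.

```python
GENOME_SITES = 3_164_700_000  # sites in the human genome, as quoted above

# "Almost 99.9% identical" leaves roughly 0.1% of sites that can vary between people.
human_variable_sites = GENOME_SITES // 1000

# "About 98% identical" leaves roughly 2% of sites differing between chimpanzees and us
# (assumption: the percentage is applied to the human site count).
chimp_differing_sites = GENOME_SITES * 2 // 100

print(f"{human_variable_sites:,}")   # 3,164,700
print(f"{chimp_differing_sites:,}")  # 63,294,000
```

So even a 0.1% difference still leaves on the order of three million sites at which any two people might differ.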
Human DNA is not a single long strand of bases, however, but is broken up into 23 pairs of chromosomes, with one member of each pair coming from each parent, making 46 chromosomes in all. A gene is a contiguous string of sites on a chromosome that contains about 3,000 sites on average, although the sizes vary greatly, with the largest gene being 2.4 million sites long. Each gene contains the code for manufacturing a specific protein in the body and it is these proteins that determine how the various systems and organs in the body function.
The first 22 chromosome pairs referred to above have the same sequence of gene arrangements along their length, but the two specific genes (called ‘alleles’) that they contain at any given gene location could be different. So while both chromosomes would have genes for eye color at identical locations along the chromosome, one might code for blue eyes while the other might be slightly different and code for brown eyes. One of the genes might be dominant and the other recessive, in which case only the dominant trait is seen in the actual organism.
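The dominant/recessive rule can be written down as a tiny toy model. This is only a sketch of the simplified picture described above, with one gene location and one fully dominant allele; the function name `visible_trait` and the choice of brown as the dominant allele are assumptions made for illustration, and real traits such as eye color involve more than one gene.

```python
def visible_trait(allele_a, allele_b, dominant):
    """Return the trait seen in the organism under simple dominance.

    allele_a and allele_b are the two alleles at one gene location,
    one from each chromosome of the pair (hypothetical toy model).
    """
    if allele_a == allele_b:
        # Both chromosomes carry the same allele, so that trait is seen.
        return allele_a
    if dominant in (allele_a, allele_b):
        # The alleles differ and one is dominant: it masks the recessive one.
        return dominant
    raise ValueError("alleles differ and neither is the stated dominant one")


print(visible_trait("brown", "blue", dominant="brown"))  # brown
print(visible_trait("blue", "blue", dominant="brown"))   # blue
```

In this model the recessive trait shows up only when both chromosomes carry the recessive allele.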
Hence in 22 pairs of the chromosomes, each member of the pair contains the same kind of genetic information, differing only in detail. Only the two in pair #23, which consists of the X and Y chromosomes that distinguish the sexes, differ considerably in basic structure. So it is sufficient for the purpose of cataloging the human genome to identify the arrangement of just 24 chromosomes, one from each of the first 22 pairs, plus the X and Y from pair #23. These 24 chromosomes vary in length from 50 million to 250 million bases.
The genes specify the code for manufacturing proteins and each protein is made up of a string of amino acids. The genes specify the order in which amino acids are put together in the following way: each group of three consecutive base sites in the gene either specifies the identity of a single amino acid to be added or signals an end to the process because the protein has been completed. There are twenty distinct amino acids in all and as you read along the string of gene bases, every three consecutive sites specify which amino acid is to be added on to what has already been produced. The process continues until a sequence of three bases signals that the process should stop since the required protein has been completed. That protein is then released into the body.
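The three-bases-at-a-time reading process described above can be sketched in a few lines of code. This is a simplified illustration, not a model of the real cellular machinery: the table below lists only a handful of the 64 possible three-base combinations, the function name `translate` is my own, and the sketch assumes reading starts at the first base of the string.

```python
# Partial table of three-base codes, using the DNA letters A, C, G, T.
# The full table has 64 entries; None marks the three "stop" signals.
CODON_TABLE = {
    "ATG": "Met",
    "TTT": "Phe",
    "GGC": "Gly",
    "GCA": "Ala",
    "AAA": "Lys",
    "TAA": None,
    "TAG": None,
    "TGA": None,
}


def translate(gene):
    """Read a gene three bases at a time and return the amino acid chain."""
    protein = []
    # Step through complete groups of three; leftover bases at the end are ignored.
    for start in range(0, len(gene) - 2, 3):
        codon = gene[start:start + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"{codon} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:
            break  # a stop signal: the protein is complete
        protein.append(amino_acid)
    return protein


print(translate("ATGTTTGGCTAAGCA"))  # ['Met', 'Phe', 'Gly']
```

Note that the final three bases (GCA) are never read, because the stop signal TAA comes before them.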
The total number of human genes in the DNA is now estimated to be about 20,000-25,000, about the same as possessed by mice and fish. Even the lowly nematode worm has over 20,000 genes, while the fruit fly has over 13,000 and yeast has over 6,000. Bacteria such as E. coli, and those that cause salmonella and staph infections, have genes that number in the range 1,500 to 4,500. (The Making of the Fittest, Sean B. Carroll, 2006, p. 77) About a thousand genes are found in every single organism, evidence of how we are all linked together, descended from a common ancestor who lived over a billion years ago. (Almost Like a Whale, Steve Jones, 1999, p. 376)
For humans, all the genes are distributed in the chromosomes, with chromosome #1 containing the most genes (2,968) and chromosome Y containing the fewest (231). Although the portions of the DNA that contain genes are the most useful functionally (since they are the ones that cause proteins to be produced), they constitute less than 2% of the DNA, and the functions of over 50% of these genes are still unknown. Repeated sequences of bases in the DNA that do not cause proteins to be made are called “junk DNA” and while they seem to have no known function (although very recent research throws this assumption into doubt), they can shed a lot of light on how life evolved.
The double helix structure of DNA explains how it is that cells can multiply by copying themselves with such accuracy during normal cell growth (called mitosis). If the copying mechanism were perfect, then no new genetic information would be created and species would never change. But fortunately for evolution, the copying mechanism is subject to small errors, and when this happens during the creation of germ (or sex) cells, the cells involved in reproduction (i.e., the sperm and ovum), the resulting changes are then passed on down to the next generation. (The creation of these germ cells by the body is called meiosis.) This is how random changes in genetic information lead to the next generation of organisms having new properties.
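The idea of an almost-perfect copying mechanism can be illustrated with a small simulation. This is a toy sketch only: the function name `copy_with_errors`, the per-site `error_rate` parameter, and the assumption that an error swaps a base for one of the other three with equal probability are all mine, chosen for simplicity rather than taken from real biology.

```python
import random

BASES = "ACGT"


def copy_with_errors(dna, error_rate, rng):
    """Copy a DNA string, miscopying each site with probability error_rate."""
    copied = []
    for base in dna:
        if rng.random() < error_rate:
            # A copying error: substitute one of the three other bases.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


rng = random.Random(0)  # fixed seed so repeated runs give the same result
parent = "ACGTACGTACGTACGTACGT"
child = copy_with_errors(parent, error_rate=0.05, rng=rng)
changed_sites = sum(1 for p, c in zip(parent, child) if p != c)
print(child, changed_sites)
```

With `error_rate` set to zero the copy is always identical to the original, which corresponds to the case where no new genetic information is ever created.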
We now have, with the discovery of the double helix of DNA, far more detailed knowledge than Darwin ever had about how these mutations occur. The next question to be examined is whether these mutations occur at a sufficiently rapid rate to explain the facts of complexity we see around us.
Next in this series: The sufficiency of the mutation rate
POST SCRIPT: Impeachment
There is an increasing sentiment in the country to impeach Bush and Cheney.
Independent documentary filmmaker Robert Greenwald has made a short film making the case for impeaching Cheney, and there is also a petition that you can sign.