The Irish Genome

What a curious paper — it’s fine research, and it’s a useful dollop of data, but it’s simultaneously so 21st century and on the edge of being completely trivial. It’s like a tiny shard of the future whipping by on its way to quaintness.

Researchers have for the first time sequenced the genome of an Irishman, a fellow confirmed to be the product of at least 3 generations of fully Irish ancestors.

It’s a good piece of work, another piece in the puzzle of human genomics, but it’s also a little bit odd. I’m always excited to see another organism’s genome sequenced, the first marsupial, the first sea anemone, the first avian, etc., even though it’s become a bit commonplace (oh, another bacterium sequenced…); it’s just weird to see “Irish” announced as a novel addition to the ranks of sequenced organisms, as if it were Capitella or something. Cool, but a little jarring.

It’s also a genre with limited prospects. If you’re busy sequencing the first Armenian or the first New Guinean or the first Luxembourger, work fast — I can’t quite imagine that most will warrant a publication, except as a formality, as I imagine this paper is. We’re entering the era of personalized genomics, when anyone will be able to get their sequence done for under a thousand dollars. I don’t imagine that a paper titled “Sequencing and analysis of PZ Myers’ human genome” will get published in Nature. But if anyone wants to try, I’ll gladly send them a few cells and my permission.

Anyway, the paper got the sequence of this Irish fella. They identified many unique single nucleotide polymorphisms that may be useful molecular markers of Irish ancestry; a few of the new alleles seem to be associated with diseases like inflammatory bowel and chronic liver problems. They identified a few genes bearing the signature of positive selection. Here are their conclusions:

The first Irish human genome sequence provides insight into the population structure of this branch of the European lineage which has a distinct ancestry from other published genomes. At 11 fold genome coverage approximately 99.3% of the reference genome was covered and more than 3 million SNPs were detected, of which 13% were novel and may include specific markers of Irish ancestry. We provide a novel technique for SNP calling in human genome sequence using haplotype data and validate the imputation of Irish haplotypes using data from the current Human Genome Diversity Panel (HGDP-CEPH). Our analysis has implications for future re-sequencing studies and suggests that relatively low levels of genome coverage, such as that being used by the 1000 genomes project, should provide relatively accurate genotyping data. Using novel variants identified within the study, which are in linkage disequilibrium with already known disease associated SNPs, we illustrate how these novel variants may point towards potential causative risk factors for important diseases. Comparisons with other sequenced human genomes allowed us to address positive selection in the human lineage and to examine the relative contributions of gene function and gene duplication events. Our findings point towards the possible primacy of recent duplication events over gene function as indicative of a genes likelihood of being under positive selection. Overall we demonstrate the utility of generating targeted whole genome sequence data in helping to address general questions of human biology as well as providing data to answer more lineage-restricted questions.

Hey, it’s data. But I think it will be made much more interesting when it acquires more context. One Irish genome doesn’t give us much information on Irish variation. It’s information to complement the 1000 Genomes Project (the Irish study is not part of that bigger project), which intends to take a nice snapshot of human genetic diversity by sampling 100 individuals from each of 10 distinct populations. Then the hard part comes: comparing and analyzing everything.

Oh, and the digging out from all the ethnic jokes that will appear in the comments.


Tong P, Prendergast JGD, Lohan AL, Farrington SM, Cronin S, Friel N, Bradley DG, Hardiman O, Evans A, Wilson JF and Loftus BJ (2010) Sequencing and analysis of an Irish human genome. Genome Biology (in press)

Blaschko’s Lines

One of the subjects developmental biologists are interested in is the development of pattern. There are the obvious externally visible patterns — the stripes of a zebra, leopard spots, the ordered ranks of your teeth, etc., etc., etc. — and in fact, just about everything about most multicellular organisms is about pattern. Without it, you’d be an amorphous blob.

But there are also invisible patterns that you don’t normally see that are aspects of the process of assembly, the little seams and welds where disparate pieces of the organism are stitched together during development. The best known ones are compartment boundaries in insects. A fly’s wing, for instance, has a normally undetectable line running across the middle of it, a line that cells respect. A cell born on the front half of the wing will multiply and expand its progeny to cover a patch on the surface, but none of its offspring cells will cross over the invisible line into the back half. Similarly, cells born on the back half will never wander into the front.

We can see these invisible lines by taking advantage of mosaicism: generate a fly wing with two genetically distinct cell types, for instance by making one type express a pigment marker and the other not, and the boundaries become apparent. There are many ways we can generate mosaics, but in Drosophila we can use somatic recombination — with low frequency, chromosomes in the fly can undergo crossing over in mitosis, not just meiosis, so sometimes the swapping of chromosome segments will turn a daughter cell that should have been heterozygous for an allele into one that is homozygous, allowing a marker allele to express itself.


(A) The shapes of marked clones in the Drosophila wing reveal the existence of a compartment boundary. The border of each marked clone is straight where it abuts the boundary. Even when a marked clone has been genetically altered so that it grows more rapidly than the rest of the wing and is therefore very large, it respects the boundary in the same way (drawing on right). Note that the compartment boundary does not coincide with the central wing vein. (B) The pattern of expression of the engrailed gene in the wing. The compartment boundary coincides with the boundary of engrailed gene expression.

It’s like a secret code written in molecules hidden to the eye until you illuminate it in just the right way to expose it. And these lines aren’t just arbitrary, they’re significant. The wing boundary defines the expression of important molecules that define the identity of specific structures. The posterior half of the wing is the domain of expression of a molecule called engrailed, which is part of the machinery that makes the back half a back half. We can also stain a wing for just that gene product, and also expose the hidden lines.


We can also mutate the pathway of which engrailed is part, and do interesting things to the fly wing, like turn the back half into a mirror image of the front half. So these lines actually matter for the proper development of a fly.

So you might be wondering if we have anything similar in humans…and no, we don’t have strict compartment boundaries like a fly. However, we do have normally invisible lines and stripes of subtle molecular differences running across our bodies, which are occasionally exposed by human mosaicism. These are marks called the lines of Blaschko, after the investigator who first reported a common set of patterns in patients with dermatological disorders in 1901.

Don’t rip off your shirt and start looking for the Blaschko lines — they’re almost always invisible, remember! What happens is that sometimes people with visible dermatological problems — rashes, peculiar pigmentation, swathes of moles, that sort of thing — express the problems in a stereotypically patterned way. On the back, there are V-shaped patterns; on the abdomen and chest, S-shaped swirls; and on the limbs, longitudinal streaks.

Here is the standard arrangement:

[Figure: the lines of Blaschko]

And here are a few examples:

[Figures: examples of Blaschko-line patterns, back and front views]

Note that usually there isn’t a whole-body arrangement of tiger stripes everywhere — there may be a single band of peculiar skin that represents one part of the whole.

Where do these come from? The current hypothesis is that a patch of tissue that follows a Blaschko line represents a clone of cells derived from a single cell in the early embryo. These clones follow stereotypical expansion and migration patterns depending on their position in the embryo; this would suggest that a cell in the middle of the back of a tiny embryo, as it grows larger with the growing embryo, would tend to expand first upwards towards the head and then sweep backwards and around to the front. One way to think of it: imagine taking a piece of yellow clay and sandwiching it between two pieces of green clay into a block, and then pushing and stretching the clay block to make a human figurine. The yellow would make a band somewhere in the middle, all right, but it wouldn’t be a simple rectilinear slice anymore — it would express a more complex border that reflected the overall flow of the medium.

What makes the lines visible in some people? The likeliest explanation is mosaicism, a difference between two adjacent cells in the early embryo that then appears as a genetic difference in the expanded tissues. There are a couple of ways human beings can be mosaic.

The most common example is X-chromosome inactivation in women. Women have two X-chromosomes, but men only have one; to maintain parity in the regulation of expression of X-linked genes, women completely shut down one X. Which one is shut down is entirely random. That means, of course, that all women are mosaic, with different X-chromosomes shut down in different cells. This normally makes no difference, since equivalent alleles are present on each, but occasionally an X-linked skin disorder can manifest itself in a splotchy pattern. Another familiar example is the calico fur color in female cats, caused by the random expression of a pigment gene on the feline X chromosome.
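To make the clonal logic concrete, here is a toy sketch in Python. It assumes, purely for illustration, a one-dimensional strip of founder cells whose descendants stay together in contiguous blocks; real embryonic clones migrate and intermingle, and the function name and parameters are invented for this example.

```python
import random

def mosaic_strip(founders, divisions, rng):
    """Toy model: each founder cell silences one X at random, and every
    descendant inherits that choice (hypothetical 1-D layout)."""
    blocks = []
    for _ in range(founders):
        active = rng.choice("MP")  # M: maternal X active, P: paternal X active
        blocks.append(active * (2 ** divisions))  # clone after `divisions` doublings
    return "".join(blocks)

rng = random.Random(42)
print(mosaic_strip(founders=12, divisions=2, rng=rng))
```

The printed string is a run of M and P patches: the choice was random in each founder, but each patch is uniform because the decision is inherited by the whole clone.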

A more spectacular example is tetragametic chimerism. This rare event is the result of the fusion of two non-identical twins at an early stage of development, producing an embryo that is a kind of salt-and-pepper mix of two individuals. After the fusion, the embryo develops normally as a single individual, but genetic or molecular tests can detect the patches of different genotypes. (No scientific tests can tell whether the individual has two souls, however.)

Another way differences can arise is by somatic mutation. Mutations occur all the time, not just in the germ line; we’re all a mixture of cells with slightly different mitotic histories and some of them contain novel mutations, usually not of a malign sort, or you wouldn’t be reading this right now. But what can happen is that you acquire a mutation in one cell that may predispose its clone of progeny to form moles, or acquire a skin disease, or even tilt it towards going cancerous. It’s a fine thing to undergo genetic screening to find that you may not carry certain alleles associated with cancer, but you aren’t entirely off the hook: you may have patches of tissue in your body that are perfectly normal and functional except that they carry an enabling mutation that occurred when you were an embryo.

One final likely mechanism is epigenetic. Throughout development, genes are switched on and off by epigenetic modification of the DNA. This process can vary: epigenetic silencing doesn’t have to be 0 or 100% absolute, but can differ in degree from cell to cell. It can also vary by chromosome — you’re all diploid, and epigenetic modification may affect one chromosome of a pair to a different degree than the other. Since epigenetic modifications are inherited by the progeny of a cell, that means these differences can be propagated into a clonal patch…that on the skin, will likely follow the lines of Blaschko.

Don’t fret over these lines; they aren’t a disease or a problem or even, in most cases, at all visible. The cool thing about them is that there is a hidden map of your secret history as an individual embedded in silent patterns in your skin — you were not defined as a single, simple, discrete genetic entity at fertilization, but are the product of complicated, subtle changes and errors and shufflings and sortings of cells. We’re all beautiful pointillist masterpieces.

Excellent interview with Craig Venter

Spiegel has a wonderful interview with Venter. The more I hear from Venter, the more I like him; he’s very much a no-BS sort of fellow. He’s the guy who really drove the human genome project to completion, and he’s entirely open about explaining that its medical significance was grossly overstated.

SPIEGEL: So the significance of the genome isn’t so great after all?

Venter: Not at all. I can tell you from my own experience. I put my own genome on the Internet. People had the notion this was the scariest thing out there. But what happened? Nothing.

There really was a lot of hysteria in the early days about how the insurance companies would abuse the information in the genome, and there was also the GATTACA dystopia. None of it has, and I daresay none of it will, come to pass.

Venter: That’s what you say. And what else have I learned from my genome? Very little. We couldn’t even be certain from my genome what my eye color was. Isn’t that sad? Everyone was looking for miracle ‘yes/no’ answers in the genome. “Yes, you’ll have cancer.” Or “No, you won’t have cancer.” But that’s just not the way it is.

SPIEGEL: So the Human Genome Project has had very little medical benefits so far?

Venter: Close to zero to put it precisely.

SPIEGEL: Did it at least provide us with some new knowledge?

Venter: It certainly has. Eleven years ago, we didn’t even know how many genes humans have. Many estimated that number at 100,000, and some went as high as 300,000. We made a lot of enemies when we claimed that there appeared to be considerably fewer — probably closer to the neighborhood of 40,000! And then we found out that there are only half as many. I was just in Stockholm for the 200th anniversary of the Karolinska Institute. The first presentation was about the many achievements the decoding of the genome has brought. Then I spoke and said that this century will be remembered for how little, and not how much, happened in this field.

Hmmm…I seem to recall that Venter’s company was one that was trying to patent an inflated number of genes, which contradicts what he’s claiming here. But otherwise, yes, the HGP isn’t yet a source of useful medical information, but it’s a trove of scientific information; I’d also add that the technology race put a lot of useful techniques in our hands.

Venter: Exactly. Why did people think there were so many human genes? It’s because they thought there was going to be one gene for each human trait. And if you want to cure greed, you change the greed gene, right? Or the envy gene, which is probably far more dangerous. But it turns out that we’re pretty complex. If you want to find out why someone gets Alzheimer’s or cancer, then it is not enough to look at one gene. To do so, we have to have the whole picture. It’s like saying you want to explore Valencia and the only thing you can see is this table. You see a little rust, but that tells you nothing about Valencia other than that the air is maybe salty. That’s where we are with the genome. We know nothing.

Exactly! Traits are products of overlapping networks of genes. Venter also explains that a lot of the effects of genes are developmental, so you can’t expect to be able to take a pill to correct something that went wrong in the assembly process in the embryo.

Here’s my favorite exchange from the interview.

Venter: Yes, and I find them frightening. I can read your genome, you know? Nobody’s been able to do that in history before. But that is not about God-like powers, it’s about scientific power. The real problem is that the understanding of science in our society is so shallow. In the future, if we want to have enough water, enough food and enough energy without totally destroying our planet, then we will have to be dependent on good science.

SPIEGEL: Some scientists don’t rule out a belief in God. Francis Collins, for example …

Venter: … That’s his issue to reconcile, not mine. For me, it’s either faith or science – you can’t have both.

SPIEGEL: So you don’t consider Collins to be a true scientist?

Venter: Let’s just say he’s a government administrator.

Oh, snap.

It’s more than genes, it’s networks and systems


Most of you don’t understand evolution. I mean this in the most charitable way; there’s a common conceptual model of how evolution occurs that I find everywhere, and that I particularly find common among bright young students who are just getting enthusiastic about biology. Let me give you the Standard Story, the one that I get all the time from supporters of biology.

Evolution proceeds by mutation and selection. A novel mutation occurs in a gene that gives the individual inheriting it an advantage, and that person passes it on to their children, who also get the advantage, do better than their peers, and leave more offspring. Given time, the advantageous mutation spreads through the population so the entire species has it.

One example is the human brain. An ape man millions of years ago acquired a mutation that made his or her brain slightly larger, and since individuals carrying it were slightly smarter than other ape men, it spread through the population. Then later, other mutations occurred and were selected for, and so human brains gradually got larger and larger.

You either know what’s wrong here or you’re feeling a little uneasy—I gave you enough hints that you know I’m going to complain about that story, but if your knowledge is at the Evolutionary Biology 101 level, you may not be sure what it is.

Just to make you even more queasy, the misunderstanding here is one that creationists have, too. If you’ve ever encountered the cryptic phrase “RM+NS” (“random mutation + natural selection”) used as a pejorative on a creationist site, you’ve found someone with this affliction. They’ve got it completely wrong.

Here’s the problem, and also a brief introduction to Evolutionary Biology 201.

First, it’s not exactly wrong — it’s more like taking one good explanation of certain kinds of evolution and making it a sweeping claim that that is how all evolution works. By reducing it to this one scheme, though, it makes evolution far too plodding and linear, and reduces it all to a sort of personal narrative. It isn’t any of those things. What’s left out in the 101 story, and in creationist tales, is that: evolution is about populations, so many changes go on in parallel; selectable traits are usually the product of networks of genes, so there are rarely single alleles that can be categorized as the effector of change; and genes and gene networks are plastic or responsive to the environment. All of these complications make the actual story more complicated and interesting, and also, perhaps to your surprise, make evolutionary change faster and more powerful.

Think populations

Mutations are the root of biological variation, of course, but we often have a naive view of their consequences. Most mutations are neutral. Even advantageous mutations are subject to laws of chance in their propagation, and a positive selection coefficient does not mean there will be an inexorable march to fixation, where every individual has the allele. This is also true of deleterious mutations: chance often dominates, and unless it is a strongly negative allele, like an embryonic lethal mutation, there’s also a chance it can spread through the population.
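A small simulation makes the role of chance easier to see. The sketch below is not from any of the papers discussed here; it assumes a simple haploid Wright-Fisher model (fixed population size, non-overlapping generations), and the function name, population size, and selection coefficient are all illustrative choices.

```python
import random

def fixation_fraction(pop_size, s, trials, seed=0):
    """Fraction of trials in which one new mutant copy spreads to the whole
    population, under a simple haploid Wright-Fisher model (an assumption)."""
    rng = random.Random(seed)
    fixed = 0
    for _ in range(trials):
        count = 1  # a single new mutant among pop_size individuals
        while 0 < count < pop_size:
            # Selection nudges the expected frequency upward by a factor (1 + s)...
            weight = count * (1 + s)
            p = weight / (weight + (pop_size - count))
            # ...but the next generation is still a random sample (drift).
            count = sum(1 for _ in range(pop_size) if rng.random() < p)
        fixed += count == pop_size
    return fixed / trials

if __name__ == "__main__":
    print("neutral:     ", fixation_fraction(100, 0.00, 2000))
    print("advantageous:", fixation_fraction(100, 0.05, 2000))
```

For a 5% advantage, the classic approximation puts the chance of fixation at roughly 2s, or about 10%, so in this model something like nine out of ten "good" mutations are expected to disappear by chance.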

Stop thinking of mutations as unitary events that either get swiftly culled, because they’re deleterious, or get swiftly hauled into prominence by the uplifting crane of natural selection. Mutations are usually negligible changes that get tossed into the stewpot of the gene pool, where they simmer mostly unnoticed and invisible to selection. Look at human faces, for instance: they’re all different, and unless you’re looking at the extremes of beauty or ugliness, the variations simply don’t make much difference. Yet all those different faces really are the result of subtly different combinations of mutant forms of genes.

“Combinations” is the magic word. A single mutation rarely has a significant effect on a feature, but the combination of multiple mutations may have a detectable or even novel effect that can be seen by natural selection. And that’s what’s going on all the time: the population is a huge reservoir of genetic variation, and what we do when we reproduce is sort and mix and generate new combinations that are then tested in the environment.

Compare it to a game of poker. A two of hearts in itself seems to be a pathetic little card, but if it’s part of a flush or a straight or three of a kind, it can produce a winning hand. In the game, it’s not the card itself that has power, it’s its utility in a pattern or combination of other cards. A large population like ours is a great shuffler that is producing millions of new hands every day.

We know that this recombination is essential to the rapid acquisition of new phenotypes. Here are some results from a classic experiment by Waddington. Waddington noted that fruit flies expressed the odd trait of developing four wings (the bithorax phenotype) instead of two if they were exposed to ether early in development. This is not a mutation! This is called a phenocopy, where an environmental factor induces an effect similar to a genetic mutation.

What Waddington did next was to select for individuals that expressed the bithorax phenotype most robustly, or that were better at resisting the ether, and found that he could get a progressive strengthening of the response.

The progress of selection for or against a bithorax-like response to ether treatment in two wild-type populations. Experiments 1 and 2 initially showed about 25 and 48% of the bithorax (He) phenotype.

This occurred over tens of generations, which is far, far too fast for this to be a consequence of the generation of new mutations. What Waddington was doing was selecting for more potent combinations of alleles already extant in the gene pool.

This was confirmed in a cool way with a simple experiment: the results in the graph above were obtained from wild-caught populations. Using highly inbred laboratory strains that have greatly reduced genetic variation abolishes the outcome.
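The logic of that comparison can be sketched in a few lines of Python. This is a toy model, not a reconstruction of Waddington's protocol: it assumes the response depends on the number of "boosting" alleles at 20 hypothetical loci plus some environmental noise, and every name, threshold, and frequency below is invented for illustration.

```python
import random

def make_population(size, loci, freq, rng):
    """Genomes as lists of 0/1 alleles; 1 is a variant that boosts the response."""
    return [[int(rng.random() < freq) for _ in range(loci)] for _ in range(size)]

def select_for_response(pop, generations, threshold, rng, keep=0.25):
    """Breed only from the strongest responders each generation.
    Offspring are recombinants of two parents; no new mutations ever arise."""
    history = []
    for _ in range(generations):
        # Response strength = number of boosting alleles plus environmental noise.
        scored = [(sum(g) + rng.gauss(0, 1), g) for g in pop]
        history.append(sum(score > threshold for score, _ in scored) / len(pop))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        parents = [g for _, g in scored[: max(2, int(keep * len(pop)))]]
        next_pop = []
        for _ in range(len(pop)):
            mother, father = rng.sample(parents, 2)
            next_pop.append([rng.choice((m, f)) for m, f in zip(mother, father)])
        pop = next_pop
    return history, pop

rng = random.Random(1)
wild = make_population(200, 20, 0.2, rng)
inbred = [list(wild[0]) for _ in range(200)]  # every fly genetically identical
wild_history, _ = select_for_response(wild, 20, threshold=6, rng=rng)
inbred_history, _ = select_for_response(inbred, 20, threshold=6, rng=rng)
print("wild-caught:", wild_history[0], "->", wild_history[-1])
print("inbred:     ", inbred_history[0], "->", inbred_history[-1])
```

In the variable population, the fraction of responders climbs within a handful of generations purely by reshuffling existing alleles. In the inbred population there is nothing to reshuffle, so the genotypes never change and the fraction only wobbles with the noise.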

Jonathan Bard sees this as a powerful potential factor in evolution.

Waddington’s results have excited considerable controversy over the years, for example as to whether they reflect threshold effects or hidden variation. In my view, these arguments are irrelevant to the key point: within a population of organisms, there is enough intrinsic variability that, given strong selection pressures, minor but existing variants in a trait that are not normally noticeable can rapidly become the majority phenotype without new mutations. The implications for evolution are obvious: normally silent mutations in a population can lead to adaptation if selection pressures are high enough. This view provides a sensible explanation of the relatively rapid origins of the different beak morphologies of Darwin’s various finches and of species flocks.

Think networks

One question you might have at this point is that the model above suggests that mutations are constantly being thrown into the population’s gene pool and are steadily accumulating — it means that there must be a remarkable amount of genetic variation between individuals (and there is! It’s been measured), yet we generally don’t see most people as weird and obvious mutants. That variation is largely invisible, or represents mere minor variations that we don’t regard as at all remarkable. How can that be?

One important reason is that most traits are not the product of single genes, but of combinations of genes working together in complex ways. The unit producing the phenotype is most often a network of genes and gene products, such as this lovely example of the network supporting expression and regulation of the epidermal growth factor (EGF) pathway.

That is awesomely complex, and yes, if you’re a creationist you’re probably wrongly thinking there is no way that can evolve. The curious thing is, though, that the more elaborate the network, the more pieces tangled into the pathway, the smaller the effect of any individual component (in general, of course). What we find over and over again is that many mutations to any one component may have a completely undetectable effect on the output. The system is buffered to produce a reliable yield.

This is the way networks often work. Consider the internet, for example: a complex network with many components and many different routes to get a signal from Point A to Point B. What happens if you take out a single node, or even a set of nodes? The system routes automatically around any damage, without any intelligent agency required to consciously reroute messages.
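Here is a minimal sketch of that rerouting idea, using a made-up network with two independent routes between A and B. The node names and the `has_route` helper are hypothetical; the search itself is an ordinary breadth-first search over directed links.

```python
from collections import deque

def has_route(links, start, goal, removed=frozenset()):
    """Breadth-first search for any path from start to goal that avoids removed nodes."""
    if start in removed or goal in removed:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in links.get(node, ()):
            if nxt not in seen and nxt not in removed:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical toy network: two independent routes from A to B.
links = {
    "A": ["r1", "r2"],
    "r1": ["r3"],
    "r2": ["r4"],
    "r3": ["B"],
    "r4": ["B"],
}

print(has_route(links, "A", "B"))                        # intact network
print(has_route(links, "A", "B", removed={"r1"}))        # one node knocked out
print(has_route(links, "A", "B", removed={"r1", "r4"}))  # both routes broken
```

Knocking out a single relay leaves the connection intact because the other route still works; only when every alternative is hit does the signal fail to arrive.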

But further, consider the nature of most mutations in a biological network. Simple knockouts of a whole component are possible, but often what will happen are smaller effects. These gene products are typically enzymes; what happens is a shift in kinetics that will more subtly modify expression. The challenge is to measure and compute these effects.

Graph analysis is showing how networks can be partitioned and analysed, while work on the kinetics of networks has shown first that it is possible to simplify the mathematics of the differential equation models and, second, that the detailed output of a network is relatively insensitive to changes in most of the reaction parameters. What this latter work means is that most gene mutations will have relatively minor effects on the networks in which their proteins are involved, and some will have none, perhaps because they are part of secondary pathways and so redundant under normal circumstances. Indirect evidence for this comes from the surprising observation that many gene knockouts in mice result in an apparently normal phenotype. Within an evolutionary context, it would thus be expected that, across a population of organisms, most mutations in a network would effectively be silent, in that they would give no selective advantage under normal conditions. It is one of the tasks of systems biologists to understand how and where mutations can lead to sufficient variation in networks properties for selection to have something on which to act.
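A back-of-the-envelope calculation shows how a sizable change in one reaction parameter can leave the output almost untouched. The sketch below assumes a single enzyme step following the textbook Michaelis-Menten rate law and working near saturation; the rate law and all the numbers are my illustrative assumptions, not values from Bard's paper.

```python
def mm_rate(vmax, km, substrate):
    """Michaelis-Menten rate law: v = vmax * S / (km + S)."""
    return vmax * substrate / (km + substrate)

# Hypothetical numbers: the enzyme normally works near saturation (S >> Km).
substrate = 100.0
normal = mm_rate(vmax=1.0, km=1.0, substrate=substrate)
mutant = mm_rate(vmax=1.0, km=2.0, substrate=substrate)  # Km doubled by a mutation

change = (mutant - normal) / normal
print(f"Km doubled, output changed by {change:.1%}")
```

Doubling Km here costs only about 1% of the output. The same mutation would cut the rate by a third if the substrate level were equal to the original Km, which is one way a normally silent variant can become visible when conditions change.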

Combine this with population effects. The population can accumulate many of these sneaky variants that have no significant effect on most individuals, but under conditions of strong selection, combinations of these variants, that together can have detectable effects, can be exposed to selection.

Think flexible genes

Another factor in this process (one that Bard does not touch on) is that the individual genes themselves are not invariant units. Mutations can affect how genes contribute to the network, but in addition, the same allele can have different consequences in different genetic backgrounds, because it is affected by the other genes in the network, and it also has different consequences in different external environments.

Everything is fluid. Biology isn’t about fixed and rigidly invariant processes — it’s about squishy, dynamic, and interactive stuff making do.

Now do you see what’s wrong with the simplistic caricature of evolution at the top of this article? It’s superficial; it ignores the richness of real biology; it limits and constrains the potential of evolution unrealistically. The concept of evolution as a change in allele frequencies over time is one small part of the whole of evolutionary processes. You’ve got to include network theory and gene and environmental interactions to really understand the phenomena. And the cool thing is that all of these perspectives make evolution an even more powerful force.


Bard J (2010) A systems biology view of evolutionary genetics. Bioessays 32: 559-563.

How not to evaluate a big science program

Nicholas Wade of the NY Times has written one of those stories that make biologists cringe: it just gets so much wrong. It’s a look back at the human genome project, and I was turned off at the first paragraph. The HGP was badly marketed from the very beginning in the sense that there was a misrepresentation of the scientific goals; it was well-marketed if your goal was wringing money out of Congress. Unfortunately, now we’ve got to deal with science writers complaining that nobody has generated any miracle cures from all that work. Pay attention to what Harold Varmus said:

“Genomics is a way to do science, not medicine,” said Harold Varmus, president of the Memorial Sloan-Kettering Cancer Center in New York, who in July will become the director of the National Cancer Institute.

The genome is a basic research tool, not a recipe book for curing diseases. I can’t entirely blame Wade for complaining about this, though, since some prominent people like Francis Collins were selling the HGP as the first step in generating a panacea.

But Wade ought to be embarrassed at the rampant linear ladder thinking in his article. Both Jonathan Eisen and Larry Moran take him to task for that — he makes this error-filled statement:

The barely visible roundworm needs 20,000 genes that make proteins, the working parts of cells, whereas humans, apparently so much higher on the evolutionary scale, seem to have only 21,000 protein-coding genes.

Humans aren’t high on the evolutionary scale…there is no evolutionary scale. We aren’t the pinnacle of anything. It’s also weird to see people still expressing astonishment that we “only” have about 20,000 genes. Way, way back in the dim and distant past, when I was a lowly undergraduate in 1977 (AD, I think), my genetics professor, Larry Sandler, lectured to us about how Drosophila was thought to have about 10-15,000 genes and humans might have about twice that…but that when you looked at the C-value paradox (that the quantity of DNA in organisms doesn’t correlate at all well with our perceptions of complexity), it really didn’t mean much, especially since we didn’t (and still don’t) know what most of those genes do. In the early days of the HGP there was a mad flurry of speculation, mostly from people with economic interests in more genes, that there were 100-200,000 genes, but everyone who knew anything about genetics gave those a squinty cynical look.

Apparently, there’s going to be a second article in this series from Wade: “Next: Drug companies stick with genomics but struggle with information overload.” Please. If you want to do a retrospective on the impact of the human genome project, don’t go talking to the drug companies.

Autism and the search for simple, direct answers

I’ve gotten some email asking for a simplified executive summary of this paper, so here it is.

A large study screened almost a thousand autistic individuals for genetic variations that distinguish them from control individuals, and found that Autism Spectrum Disorder has many different genetic causes: there isn’t one single gene responsible for ASD, but a constellation of hundreds, each with the potential to affect the development of the brain and cause the symptoms of autism. The researchers don’t know exactly how each of these genes contributes to the disorder, but they have found that many of them are involved in growth and cell communication and the formation of synapses in the brain.

The bottom line is that there are many different ways to cause the symptoms of autism, and it’s a mistake to try to pin it all on single, simple causes. Any hope for amelioration lies in understanding the general functional processes that are disrupted by mutations in various pathways.

Coming up with simple, one-size-fits-all answers to serious problems is so tempting and so satisfying. Look at autism, for instance: a mysterious disease with a wide range of expression, so wide that it is more properly called Autism Spectrum Disorder (ASD), and the popular press and various celebrities all want it to be pegged to a simple cause: it’s vaccines, or it’s mercury, or it’s the dose of the vaccines, and all we have to do to fix it is not vaccinate, or reduce the number of vaccinations, or use chelation therapy to extract poisons, and presto, a cure! This is magical thinking, pure and simple, and it doesn’t work.

ASD isn’t simple, it’s not one disease, it doesn’t have one cause, and vaccines are definitely not the cause: if there’s one thing the research has done, it’s to thoroughly rule out the idea that giving kids shots at an early age causes autism. What we’re actually discovering more and more is that ASD can be traced to genetic variation.

Again, though, the causes aren’t simple. There is no single mutation to which ASD can be pinned.

For example, one hot spot for an association of genes with autism is the long arm of chromosome 22; cases of developmental delays and autistic behavior have been associated with partial deletions in chromosome 22, and the problems have even been narrowed down to one specific gene, SHANK3, which is expressed in neurons and localized to synapses. We know that if you’ve got a broken copy of this particular gene, you’re likely to have ASD.

How many ASD individuals have this specific genetic change? 0.75%. It is a cause in less than 1% of all affected individuals, so it cannot be the sole cause of ASD in all cases. We have to get out of this mindset that tries to find single causes for complex phenomena; ASD is a case where we have a complex range of disorders with multiple, complex causes.

So how do we get a handle on ASD? This is where the work gets interesting: just because something is multi-causal does not mean that science can’t get a grip on it and that we can’t learn anything interesting about it. We’ve got lots of new tools for analyzing broad properties of genomes now, and one promising line of attack is the use of methods for measuring and identifying copy number variants in individuals and populations.

Copy number variants (CNVs) are surprisingly common. If you’ve had any biology instruction at all, you’re probably familiar with the Mendelian concept that we have two copies of each chromosome, and two copies of each gene. As it turns out, that is an oversimplification: sometimes, a piece of a chromosome is accidentally duplicated, and then you’ll carry two copies of the associated gene on one chromosome, and one copy on another chromosome, for a total of 3 copies. And in some cases, these duplications have occurred often enough that you’ll have many more than 3; the median number of copies of the gene for amylase (an enzyme that breaks down starch) in European American populations is 7, with a range of 2 to 15 in different individuals. Get used to it: this kind of variation in copy number seems to happen fairly often.

Now in the case of amylase, the effect of this variation is mild: individuals with more copies of the gene produce more of the enzyme and break down starchy foods faster. It does have evolutionary effects, since cultures with diets rich in starch contain individuals who have, on average, more copies of the gene than individuals in cultures where starches are less common in the diet. But what if these chance variations in copy number affect genes involved in the function of the brain? We might see more profound effects on behavior or cognitive ability. The defect in SHANK3 mutations is an example of a reduction in copy number of that gene; what if we could screen populations of ASD individuals not for a specific gene variant, but for the more general occurrence of frequent variations in copy number of any genes…and then we could ask which genes are often affected?

It’s being done. A new paper in Nature describes a screen of control and ASD individuals to identify rare copy number variants associated with autism. It worked! In fact, it worked maybe a little too well, since we now have an embarrassment of riches, a great many genes that may be related to ASD.

The autism spectrum disorders (ASDs) are a group of conditions characterized by impairments in reciprocal social interaction and communication, and the presence of restricted and repetitive behaviours. Individuals with an ASD vary greatly in cognitive development, which can range from above average to intellectual disability. Although ASDs are known to be highly heritable (~90%), the underlying genetic determinants are still largely unknown. Here we analysed the genome-wide characteristics of rare (<1% frequency) copy number variation in ASD using dense genotyping arrays. When comparing 996 ASD individuals of European ancestry to 1,287 matched controls, cases were found to carry a higher global burden of rare, genic copy number variants (CNVs) (1.19 fold, P = 0.012), especially so for loci previously implicated in either ASD and/or intellectual disability (1.69 fold, P = 3.4 × 10⁻⁴). Among the CNVs there were numerous de novo and inherited events, sometimes in combination in a given family, implicating many novel ASD genes such as SHANK2, SYNGAP1, DLGAP2 and the X-linked DDX53-PTCHD1 locus. We also discovered an enrichment of CNVs disrupting functional gene sets involved in cellular proliferation, projection and motility, and GTPase/Ras signalling. Our results reveal many new genetic and functional targets in ASD that may lead to final connected pathways.

They analyzed both affected individuals and their parents, and found both familial transmission — that is, the child with ASD had received a copy number variant from a parent who was a carrier — and de novo events — that is, the child had a spontaneous, new mutation that was not present in either parent. There is no one single gene that can be tagged as the cause of autism: they identified 226 de novo and 219 inherited copy number variants in affected individuals. No one individual carries all of these variants, of course — the results tell us that there are many different paths to ASD.

Oh, no, you may be tempted to wail, autism is hundreds of diseases, with even more possible combinations of variants, and every individual is unique — this is no way to get a handle on what’s actually happening to autistic kids! Don’t despair, though, this is just the start. Although there are many genes involved, we can try to ask what all of them have in common functionally. There may be common consequences from all of these different genes, so maybe we can identify the common errors in the process of building a brain that lead to ASD.

Here’s a first stab at puzzling out what these genes do. The genes that have been identified as being deficient in ASD individuals are mapped out by known functions, and what jumps out at you is that the hundreds of specific genes fall into a smaller number of functional categories. Many of them cluster in a few functional roles: cell proliferation (genes that affect the number of cells in a tissue), cell projection (particularly important in neurons, where cells will extend long processes that project into target regions), and a specific class of cell signaling molecules, RAS-GTPases, which are involved in how cells communicate with one another and are particularly important in synapses, or the linkages between neurons.

i-8d23aed462751aa3822b506f48725d65-asd_map-thumb-425x181-50842.jpeg

Enrichment results were mapped as a network of gene sets (nodes) related by mutual overlap (edges), where the colour (red, blue or yellow) indicates the class of gene set. Node size is proportional to the total number of genes in each set and edge thickness represents the number of overlapping genes between sets. a, Gene sets enriched for deletions are shown (red) with enrichment significance (FDR q-value) represented as a node colour gradient. Groups of functionally related gene sets are circled and labelled (groups, filled green circles; subgroups, dashed line). b, An expanded enrichment map shows the relationship between gene sets enriched in deletions (a) and sets of known ASD/intellectual disability genes. Node colour hue represents the class of gene set (that is, enriched in deletions, red; known disease genes (ASD and/or intellectual disability (ID) genes), blue; enriched only in disease genes, yellow). Edge colour represents the overlap between gene sets enriched in deletions (green), from disease genes to enriched sets (blue), and between sets enriched in deletions and in disease genes or between disease gene-sets only (orange). The major functional groups are highlighted by filled circles (enriched in deletions, green; enriched in ASD/intellectual disability, blue).
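
As a rough illustration of how a map like this is put together, here is a minimal Python sketch. The gene-set members are invented placeholders, not data from the paper; the only things taken from the caption are that node size tracks the number of genes in a set and that edge thickness tracks the number of genes two sets share.

```python
from itertools import combinations

# Hypothetical gene sets; the members (g1, g2, ...) are placeholders,
# not the paper's data.
gene_sets = {
    "cell proliferation": {"g1", "g2", "g3", "g4"},
    "cell projection": {"g3", "g4", "g5"},
    "GTPase/Ras signalling": {"g5", "g6"},
}

def build_enrichment_map(sets):
    """Return (nodes, edges): node size = genes in set, edge weight = shared genes."""
    nodes = {name: len(genes) for name, genes in sets.items()}
    edges = {}
    for a, b in combinations(sorted(sets), 2):
        shared = len(sets[a] & sets[b])
        if shared:  # only connect sets that actually overlap
            edges[(a, b)] = shared
    return nodes, edges

nodes, edges = build_enrichment_map(gene_sets)
```

Sets that share no genes get no edge at all, which is why functionally related categories end up drawn as connected clusters.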

The second map above ties the various copy number variants to previously known disease genes involved in ASD, and what catches my eye is the dense cloud of variants associated with central nervous system development. That tells me right there that it is inappropriate to treat ASD as something that is switched on or off by simple causal factors: ASD is the product of long-developing, subtle changes in the growth of the nervous system in embryos and infants.

So the conclusion, as expected, is that ASD is a multi-factorial disorder with a strong genetic component — but definitely not single-locus inheritance, as many different genes are involved.

Our findings provide strong support for the involvement of multiple rare genic CNVs, both genome-wide and at specific loci, in ASD. These findings, similar to those recently described in schizophrenia, suggest that at least some of these ASD CNVs (and the genes that they affect) are under purifying selection. Genes previously implicated in ASD by rare variant findings have pointed to functional themes in ASD pathophysiology. Molecules such as NRXN1, NLGN3/4X and SHANK3, localized presynaptically or at the post-synaptic density (PSD), highlight maturation and function of glutamatergic synapses. Our data reveal that SHANK2, SYNGAP1 and DLGAP2 are new ASD loci that also encode proteins in the PSD. We also found intellectual disability genes to be important in ASD. Furthermore, our functional enrichment map identifies new groups such as GTPase/Ras, effectively expanding both the number and connectivity of modules that may be involved in ASD. The next step will be to relate defects or patterns of alterations in these groups to ASD endophenotypes. The combined identification of higher-penetrance rare variants and new biological pathways, including those identified in this study, may broaden the targets amenable to genetic testing and therapeutic intervention.

There aren’t any simple answers. There are some hints of hope for future treatment, though, in the recognition that there are a few functional modules that are being commonly impaired by these many different genes; it at least focuses the direction of future research into some narrower domains.

One fact is so obvious that it’s unfortunate I have to mention it: no external agent, such as a vaccine, can generate a consistent pattern of duplication and deletions in an affected individual’s cells. These data say it’s an error to chase down transient environmental agents given relatively late in life to people.


Pinto D et al. (2010) Functional impact of global rare copy number variation in autism spectrum disorders. Nature doi:10.1038/nature09146.

Neandertal!

You don’t have to tell me, I know I’m late to the party: the news about the draft Neandertal genome sequence was announced last week, and here I am getting around to it just now. In my defense, I did hastily rewrite one of my presentations to include a long section on the new genome information, so at least I was talking about it to a few people. Besides, there is coverage from a genuine expert on Neandertals, John Hawks, and of course Carl Zimmer wrote an excellent summary. All I’m going to do now is fuss over a few things on the edge that interested me.

This was an impressive technical feat. The DNA was extracted from a few bone fragments, and it was grossly degraded: the average size of a piece of DNA was less than 200 base pairs, much of that was chemically degraded, and 95-99% of the DNA extracted was from bacteria, not Neandertal. An immense amount of work was required to filter noise from the signal, to reconstruct and reassemble, and to avoid contamination from modern human DNA. These poor Neandertals had died, had rotted thoroughly, and the bacteria had worked their way into almost every crevice of the bone to chew up the remains. All that was left were a few dead cells in isolated lacunae of the bone; their DNA had been chopped up by their own enzymes, and death and chemistry had come to slowly break them down further.

Don’t hold your breath waiting for the draft genome of Homo erectus. Time is unkind.

We have to appreciate the age of these people, too. The oldest Neandertal fossils are approximately 400,000 years old, and the species went extinct about 30,000 years ago. That’s a good run; as measured by species longevity, Homo sapiens neandertalensis is more successful than Homo sapiens sapiens. We’re going to have to hang in there for another 200,000 years to top them.

The samples taken were from bones found in a cave in Vindija, Croatia. Full sequences were derived from three individuals found there, and in addition, some partial sequences were taken from other specimens, including the original type specimen found in the Neander Valley in 1856.

i-23e0eb7849e62ad0a9dbe3ae2a2a58ea-neander_source.jpeg
Samples and sites from which DNA was retrieved. (A) The three bones from Vindija from which Neandertal DNA was sequenced. (B) Map showing the four archaeological sites from which bones were used and their approximate dates (years B.P.).

The three bones used for sequencing were directly dated to 38.1, 44.5, and 44.5 thousand years ago, which puts them on the near end of the Neandertal timeline, and after the likely time of contact between modern humans and Neandertals, which probably occurred about 80,000 years ago, in the Middle East.

Just for reference: these samples are 6-7 times older than the entire earth, as dated by young earth creationists. The span of time just between the youngest and oldest bones used is more than six thousand years, again about the same length of time as the YEC universe. Imagine that: we see these bone fragments now as part of a jumble of debris from one site, but they represent differences as great as those between a modern American and an ancient Sumerian. I repeat once again: the religious imagination is paltry and petty compared to the awesome reality.
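
The arithmetic is easy to check. This little Python sketch uses the dates quoted above (38.1 and 44.5 thousand years) and assumes a young-earth figure of 6,000 years, which is a commonly cited number and an assumption here, not something taken from the paper.

```python
# Dates of the Vindija bones in years before present, as quoted in the text.
bone_ages = [38_100, 44_500, 44_500]

# Assumed young-earth-creationist age of the earth (a commonly cited figure,
# not taken from the paper).
YEC_EARTH_AGE = 6_000

ratios = [age / YEC_EARTH_AGE for age in bone_ages]
span = max(bone_ages) - min(bone_ages)

print(f"youngest bone: {min(ratios):.1f}x the YEC earth")
print(f"oldest bone:   {max(ratios):.1f}x the YEC earth")
print(f"span between youngest and oldest: {span} years")
```

With these inputs the span between the youngest and oldest bones comes out to 6,400 years.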

A significant revelation from this work is the discovery of the signature of interbreeding between modern humans and Neandertals. When those humans first wandered out of the homeland of Africa into the Middle East, they encountered Neandertals already occupying the land…people they would eventually displace, but at least early on there was some sexual activity going on between the two groups, and a small number of human-Neandertal hybrids would have been incorporated into the expanding human population—at least, in that subset that was leaving Africa. Modern European, Asian, and South Pacific populations now contain 1-4% Neandertal DNA. This is really cool; I’m proud to think that I had as a many-times-great grandparent a muscular, beetle-browed big game hunter who trod Ice Age Europe, bringing down mighty mammoths with his spears.

However, it is a small contribution from the Neandertals to our lineage, and it’s not likely that these particular Neandertal genes had a particularly dramatic effect on our ancestors. They didn’t exactly sweep rapidly and decisively through the population; it’s most likely that they are neutral hitch-hikers that surfed the wave of human expansion. Any early matings between an expanding human subpopulation and a receding Neandertal population would have left a few traces in our gene pool that would have been passively hauled up into higher numbers by time and the mere growth of human populations. In a complementary fashion, any human genes injected into the Neandertal pool would have been placed into the bleeding edge of a receding population, and would not have persisted. No uniquely human genes were found in the Neandertals examined, but we can’t judge the preferred direction of the sexual exchanges in these encounters, because any hybrids in Neandertal tribes were facing early doom, while hybrids in human tribes were in for a long ride.

Here’s the interesting part of these gene exchanges, though. We can now estimate the ancestral gene sequence, that is, the sequences of genes in the last common ancestor of humans and Neandertals, and we can ask if there are any ‘primitive’ genes that have been completely replaced in modern human populations by a different variant, but Neandertal still retained the ancestral pattern (see the red star in the diagram below). These genes could be a hint to what innovations made us uniquely human and different from Neandertals.

i-eacb5bc2f9cc81fa2e29370680c5e1c5-neander_sweep.jpeg
Selective sweep screen. Schematic illustration of the rationale for the selective sweep screen. For many regions of the genome, the variation within current humans is old enough to include Neandertals (left). Thus, for SNPs in present-day humans, Neandertals often carry the derived allele (blue). However, in genomic regions where an advantageous mutation arises (right, red star) and sweeps to high frequency or fixation in present-day humans, Neandertals will be devoid of derived alleles.
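
The logic of the screen can be sketched in a few lines of Python. This is a toy version under loud assumptions: the region names and allele calls are invented, and the published analysis is statistical and considerably more involved than this all-or-nothing test. It only illustrates the idea in the caption: at sites where present-day humans are polymorphic, Neandertals usually carry some derived alleles, so a region where they carry none is a candidate for a sweep in the human lineage.

```python
# Toy data: for each genomic region, the Neandertal allele state at SNPs that
# are polymorphic in present-day humans. "D" = derived, "A" = ancestral.
# Region names and calls are hypothetical.
neandertal_calls = {
    "region_1": ["D", "A", "D", "A", "A", "D"],
    "region_2": ["A", "A", "A", "A", "A", "A"],  # no derived alleles
    "region_3": ["A", "D", "A", "A", "D", "A"],
}

def derived_fraction(calls):
    """Fraction of sites where the Neandertal carries the derived allele."""
    return calls.count("D") / len(calls)

def candidate_sweeps(regions, min_sites=5):
    """Regions with enough sites and no Neandertal derived alleles at all."""
    return [
        name
        for name, calls in regions.items()
        if len(calls) >= min_sites and derived_fraction(calls) == 0.0
    ]

print(candidate_sweeps(neandertal_calls))
```

The `min_sites` cutoff is an arbitrary illustrative guard: a region with only one or two informative sites could lack derived alleles by chance alone.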

There’s good news and bad news. The bad news is that there aren’t very many of them: a grand total of 78 genes were identified that have a novel form and that have been fixed in the modern human population. That’s not very many, so if you’re an exceptionalist looking for justification of your superiority to our ancestors, you haven’t got much to go on. The good news, though, is that there are only 78 genes! This is a manageable number, and it represents some useful hints to genes that would be worth studying in more detail.

One other qualification, though: these are 78 genes that have changes in their coding sequence. There are also several hundred other non-coding, presumably regulatory, sequences that are unique to humans and are fixed throughout our population. To the evo-devo mind, these might actually be the more interesting changes, eventually…but right now, there are some tantalizing prospects in the coding genes to look at.

Some of the genes with novel sequences in humans are DYRK1A, a gene that is present in three copies in Down syndrome individuals and is suspected of playing a role in their mental deficits; NRG3, a gene associated with schizophrenia; and CADPS2 and AUTS2, two genes associated with autism. These are exciting prospects for further study because they have alleles unique and universal to humans and not Neandertals, and also affect the functioning of the brain. However, let’s not get confused about what that means for Neandertals. These are genes that, when broken or modified in modern humans, have consequences on the brain. Neandertals had these same genes, but different forms or alleles of them, which are also different from the mutant forms that cause problems in modern humans. Neandertals did not necessarily have autism, schizophrenia, or the minds of people with Down syndrome! The diseases are just indications that these genes are involved in the nervous system, and the differences in the Neandertal forms almost certainly caused much more subtle effects.

Another gene that has some provocative potential is RUNX2. That’s short for Runt-related transcription factor 2, which should make all the developmental biologists sit up and pay attention. It’s a transcription factor, so it’s a regulator of many other genes, and it’s related to Runt, a well-known gene in flies that is important in segmentation. In humans, RUNX2 is a regulator of bone growth, and is a master control switch for patterning bone. In modern humans, defects in this gene lead to a syndrome called cleidocranial dysplasia, in which bones of the skull fuse late, leading to anomalies in the shape of the head, and which also causes characteristic defects in the shape of the collar bones and shoulder articulations. These, again, are places where Neandertal and modern humans differ significantly in morphology (and again, Neandertals did not have cleidocranial dysplasia; they had forms of the RUNX2 gene that would have contributed to the specific arrangements of their healthy, normal anatomy).

These are tantalizing hints to how human/Neandertal differences could have arisen—by small changes in a few genes that would have had a fairly extensive scope of effect. Don’t view the many subtle differences between the two as each a consequence of a specific genetic change; a variation in a gene like RUNX2 can lead to coordinated, integrated changes to multiple aspects of the phenotype, in this case, affecting the shape of the skull, the chest, and the shoulders.

This is a marvelous insight into our history, and represents some powerful knowledge we can bring to bear on our understanding of human evolution. The only frustrating thing is that this amazing work has been done in a species on which we can’t, for ethical reasons, do the obvious experiments of creating artificial revertants of sets of genes to the ancestral state — we don’t get to resurrect a Neandertal. With the tools that Pääbo and colleagues have developed, though, perhaps we can start considering some paleogenomics projects to get not just the genomic sequences of modern forms, but of their ancestors as well. I’d like to see the genomic differences between elephants and mastodons, and tigers and sabre-toothed cats…and maybe someday we can think about rebuilding a few extinct species.


Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH, Hansen NF, Durand EY, Malaspinas AS, Jensen JD, Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R, Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A, Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan P, Brajkovic D, Kucan Z, Gusic I, Doronichev VB, Golovanova LV, Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW, Johnson PL, Eichler EE, Falush D, Birney E, Mullikin JC, Slatkin M, Nielsen R, Kelso J, Lachmann M, Reich D, Pääbo S. (2010) A draft sequence of the Neandertal genome. Science 328(5979):710-22.

How to make a snake

First, you start with a lizard.

Really, I’m not joking. Snakes didn’t just appear out of nowhere, nor was there simply some massive cosmic zot of a mutation in some primordial legged ancestor that turned their progeny into slithery limbless serpents. One of the tougher lessons to get across to people is that evolution is not about abrupt transmutations of one form into another, but the gradual accumulation of many changes at the genetic level which are typically buffered and have minimal effects on the phenotype, only rarely expanding into a lineage with a marked difference in morphology.

What this means in a practical sense is that if you take a distinct form of a modern clade, such as the snakes, and you look at a distinctly different form in a related clade, such as the lizards, what you may find is that the differences are resting atop a common suite of genetic changes; that snakes, for instance, are extremes in a range of genetic possibilities that are defined by novel attributes shared by all squamates (squamates being the lizards and snakes together). Lizards are not snakes, but they will have inherited some of the shared genetic differences that enabled snakes to arise from the squamate last common ancestor.

So if you want to know where snakes came from, the right place to start is to look at their nearest cousins, the lizards, and ask what snakes and lizards have in common, that is at the same time different from more distant relatives, like mice, turtles, and people…and then you’ll have an idea of the shared genetic substrate that can make a snake out of a lizard-like early squamate.

Furthermore, one obvious place to look is at the pattern of the Hox genes. Hox genes are primary regulators of the body plan along the length of the animal; they are expressed in overlapping zones that specify morphological regions of the body, such as cervical, thoracic, lumbar, sacral/pelvic, and caudal mesodermal tissues, where, for instance, a thoracic vertebra would have one kind of shape with associated ribs, while a lumbar vertebra would have a different shape and no ribs. These identities are set up by which Hox genes are active in the tissue forming the bone. And that’s what makes the Hox genes interesting in this case: where the lizard body plan has a little ribless interruption to form pelvis and hindlimbs, the snake has vertebrae and ribs that just keep going and going. There must have been some change in the Hox genes (or their downstream targets) to turn a lizard into a snake.

There are four overlapping sets of Hox genes in tetrapods, named a, b, c, and d. Each set has up to 13 individual genes, where 1 is switched on at the front of the animal and 13 is active way back in the tail. This particular study looked at just the caudal members, 10-13, since those are the genes whose expression patterns straddle the pelvis and so are likely candidates for changes in the evolution of snakes.

Here’s a summary diagram of the morphology and patterns of Hox gene expression in the lizard (left) and snake (right). Let’s see what we can determine about the differences.

i-69cbf55199892732adddc5cd182dd922-nature08789-f4.2-thumb-400x294-42856.jpg

Evolutionary modifications of the posterior Hox system in the whiptail lizard and corn snake. The positions of Hox expression domains along the paraxial mesoderm of whiptail lizard (32-40 somites, left) and corn snake (255-270 somites, right) are represented by black (Hox13), dark grey (Hox12), light grey (Hox11) and white (Hox10) bars, aligned with coloured schemes of the future vertebral column. Colours indicate the different vertebral regions: yellow, cervical; dark blue, thoracic; light blue, lumbar; green, sacral (in lizard) or cloacal (in snake); red, caudal. Hoxc11 and Hoxc12 were not analysed in the whiptail lizard. Note the absence of Hoxa13 and Hoxd13 from the corn snake mesoderm and the absence of Hoxd12 from the snake genome.

The morphology is revealing: snakes and lizards have the same regions, cervical (yellow), thoracic (blue), sacral (or cloacal in the snake, which lacks pelvic structures in most species) in green, and caudal or tail segments (red). The differences are in quantity — snakes make a lot of ribbed thoracic segments — and detail — snakes don’t make a pelvis, usually, but do have specializations in that corresponding area for excretion and reproduction.

Where it really gets interesting is in the expression patterns of the Hox genes, shown with the bars that illustrate the regions where each Hox gene listed is expressed. They are largely similar in snake and lizard, with boundaries of Hox expression that correspond to transitions in the morphology of vertebrae. But there are revealing exceptions.

Compare a10/c10 in the snake and lizard. In the snake, these two genes have broader expression patterns, reaching up into the thoracic region; in the lizard, they are cut off sharply at the sacral boundary. This is interesting because in other vertebrates, the Hox 10 group is known to have the function of suppressing rib formation. Yet there they are, turned on in the posterior portion of the thorax in the snake, where there are ribs all over the place.

In the snake, then, Hox a10 and c10 have lost a portion of their function — they no longer shut down ribs. What is the purpose of the extended domain of a10/c10 expression? It may not have one. A comparison of the sequences of these genes between various species reveals a detectable absence of signs of selection — the reason these genes happen to be active so far anteriorly is because selection has been relaxed, probably because they’ve lost that morphological effect of shutting down ribs. Those big bars are a consequence of simple sloppiness in a system that can afford a little slack.

The next group of Hox genes, the 11 group, are very similar in their expression patterns in the lizard and the snake, and that reflects their specific roles. The 10 group is largely involved in repression of rib formation, but the 11 group is involved in the development of sacrum-specific structures. In birds, for instance, the Hox 11 genes are known to be involved in the development of the cloaca, a structure shared between birds, snakes, and lizards, so perhaps it isn’t surprising that they aren’t subject to quite as much change.

The 13 group has some notable differences: Hox a13 and d13 are mostly shut off in the snake. This is suggestive. The 13 group of Hox genes are the last genes, at the very end of the animal, and one of their proposed functions is to act as a terminator of patterning — turning on the Hox 13 genes starts the process of shutting down the mesoderm, shrinking the pool of tissue available for making body parts, so removing a repressor of mesoderm may promote longer periods of growth, allowing the snake to extend its length further during embryonic development.

So we see a couple of clear correlates at the molecular level for differences in snake and lizard morphology: rib suppression has been lost in the snake Hox 10 group, and the activity of the snake Hox 13 group has been greatly curtailed, which may be part of the process of enabling greater elongation. What are the similarities between snakes and lizards that are also different from other animals?

This was an interesting surprise. There are some differences in Hox gene organization in the squamates as a whole, shared by both snakes and lizards.

i-fa000ad47d905c9af235e410cac479e7-nature08789-f1.2-thumb-400x212-42847.jpg

Genomic organization of the posterior HoxD cluster. Schematic representation of the posterior HoxD cluster (from Evx2 to Hoxd10) in various vertebrate species. A currently accepted phylogenetic tree is shown on the left. The correct relative sizes of predicted exons (black boxes), introns (white or coloured boxes) and intergenic regions (horizontal thick lines) permit direct comparisons (right). Gene names are shown above each box. Colours indicate either a 1.5-fold to 2.0-fold (blue) or a more than 2.0-fold (red) increase in the size of intronic (coloured boxes) or intergenic (coloured lines) regions, in comparison with the chicken reference. Major CNEs are represented by green vertical lines: light green, CNEs conserved in both mammals and sauropsids; dark green, CNEs lost in the corn snake. Gaps in the genomic sequences are indicated by dotted lines. Transposable elements are indicated with asterisks of different colours (blue for DNA transposons; red for retrotransposons).

That’s a diagram of the structure of the chromosome in the neighborhood of the Hox d10-13 genes in various vertebrates. For instance, look at the human and the turtle: the layout of our Hox d genes is very similar, with 13-12-11-10 laid out with approximately the same distances between them, and furthermore, there are conserved non-coding elements, most likely important pieces of regulatory DNA, that are illustrated in light yellow-green and dark green vertical bars, and they are the same, too.

In other words, the genes that stake out the locations of pelvic and tail structures in turtles and people are pretty much the same, using the same regulatory apparatus. It must be why they both have such pretty butts.

But now compare those same genes with the squamates: geckos, anoles, slow-worms, and corn snakes. The differences are huge: something happened in the ancestor of the squamates that released this region of the genome from some otherwise highly conserved constraints. We don’t know what, but in general regulation of the Hox genes is complex and tightly interknit, and this order of animals acquired some other as yet unidentified patterning mechanism that opened up this region of the genome for wider experimentation.

When these regions are compared in animals like turtles and people and chickens, the genomes reveal signs of purifying selection: mutations here tend to be unsuccessful and lead to death, failure to propagate, and other horrible fates, which means tinkering here is largely unfavorable to fecundity (which makes sense: who wants a mutation expressed in their groinal bits?). In the squamates, the evidence in the genome points not to intense selection for their particular arrangement but to relaxed selection: they are generally more tolerant of variations in the Hox gene complex in this area. What was found in those enlarged intergenic regions is a greater invasion of degenerate DNA sequences: lots of additional retrotransposons, like LINEs and SINEs, which are all junk DNA.

So squamates have more junk in the genomic trunk, which is not necessarily expressed as an obvious phenotypic difference, but still means that they can more flexibly accommodate genetic variations in this particular area. Which means, in turn, that they have the potential to produce more radical experiments in morphology, like making a snake. The change in Hox gene regulation in the squamate ancestor did not immediately produce a limbless snake; instead, it was an enabling mutation that opened the door to novel variations that did not compromise viability.


Di-Poï N, Montoya-Burgos JI, Miller H, Pourquié O, Milinkovitch MC, Duboule D (2010) Changes in Hox genes’ structure and function during the evolution of the squamate body plan. Nature 464:99-103.

The presumption of Rick Warren

Rick Warren regularly scribbles up these cloying little messages he calls the Daily Hope — and rather than hope, they offer nothing but trite platitudes and unfounded certainty about a godly purpose that I find extremely discouraging. How can people find this lying tripe uplifting?

God deliberately shaped and formed you to serve him in a way that makes your ministry unique. He carefully mixed the DNA recipe that created you. David praised God for this incredible personal attention to detail God gave in designing each of us: “You made all the delicate, inner parts of my body and knit me together in my mother’s womb. Thank you for making me so wonderfully complex! Your workmanship is marvelous” (Psalm 139:13-14, NLT).

Not only did God shape you before your birth, he planned every day of your life to support his shaping process. David continues, “Every day of my life was recorded in your book. Every moment was laid out before a single day had passed” (Psalm 139:16, NLT).

This means nothing that happens in your life is insignificant. God uses all of it to mold you for your ministry to others, and shape you for your service to him.

This man needs to spend some time doing recombination experiments with fruit flies. They’re simple and revealing. For instance, genes for body and eye color (called yellow and white, respectively) are located close together on the X chromosome of Drosophila. If you cross a female who carries the yellow body and white eye alleles on one of her X chromosomes to a wild-type male, you will discover that the male progeny (which inherited a nearly empty Y chromosome from their fathers) reveal the rearrangement of alleles that occurred during the production of the female’s eggs. Most will have inherited one of the non-recombinant X chromosomes from their mother: some get a chromosome with two wild-type alleles, so they look wild-type with grayish bodies and red eyes, and others get an X chromosome with the two mutant alleles, so they’ll have yellow bodies and white eyes. And some will have inherited a chromosome rearranged by recombination events, so they’ll have gray bodies and white eyes, or yellow bodies and red eyes. And of course, if you do lots of crosses, you will get occasional mutations in those genes that produce completely unexpected results.

The important point, though, is that you learn quickly that the distribution of progeny is dictated by chance, not purpose. There is no benign allele sorter who recognizes that white eyes, for instance, are deleterious, and therefore carefully arranges each meiotic division of the egg so that the white allele gets discarded in a polar body. No, it’s random — chance alone “mixes the DNA recipe” for each individual. I am the product of a random assortment of half my father’s genes and half my mother’s genes, as are my brothers and sisters, and we’ve each acquired some deleterious and some advantageous alleles, all by chance. We are all a throw of the dice, or a chance hand dealt from the deck.
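If you don’t have a fly lab handy, the cross can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the recombination fraction of about 1.5% between yellow and white is a number I’m supplying for illustration, the function names are made up, and mutation is ignored. Each male’s phenotype simply reports which X chromosome his mother’s egg happened to carry.

```python
import random
from collections import Counter

# Assumed recombination fraction between yellow and white.
# This value is an illustrative assumption, not a figure from the post.
RECOMB_FRACTION = 0.015


def egg_chromosome(rng, r=RECOMB_FRACTION):
    """Return the (body, eye) phenotype encoded by the X chromosome in one egg.

    The mother is assumed to carry yellow + white on one X chromosome and
    the two wild-type alleles (gray body, red eyes) on the other.
    """
    # Chance alone decides which maternal X the egg starts from.
    body = rng.choice(["yellow", "gray"])
    # Without recombination, the eye allele travels with the body allele.
    eye = "white" if body == "yellow" else "red"
    # With probability r, a crossover between the two genes swaps the eye allele.
    if rng.random() < r:
        eye = "red" if eye == "white" else "white"
    return body, eye


def male_progeny(n, seed=0, r=RECOMB_FRACTION):
    """Count the phenotypes of n sons; each son shows his maternal X directly."""
    rng = random.Random(seed)
    return Counter(egg_chromosome(rng, r) for _ in range(n))


if __name__ == "__main__":
    for phenotype, count in sorted(male_progeny(10_000).items()):
        print(phenotype, count)
```

Run it with different seeds and the counts wobble around the same proportions: mostly the two parental classes, with a small number of recombinants. Nothing in the code inspects an allele and decides whether it deserves to be passed on.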

What Darwin revealed, and what has since been explained in greater detail with our understanding of genetics, is that there is a historical bias: individuals who had the luckiest throws of the dice are more likely to produce offspring with their fortunate distribution of alleles. Again, it’s not because a god shines down upon the lucky, it’s because the lucky acquired an advantage, and that advantage can be propagated into successive generations. Nothing more. No purpose, no intent, no plan required. We look at the distribution of traits in a population, and it fits a chance distribution, sometimes modified by natural selection.
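That historical bias can be sketched in code, too. What follows is a toy model, not anything from Darwin or from a specific paper: a fixed-size population of haploid individuals in which each offspring picks its parent by chance, with carriers of one allele given a slightly heavier weight in the draw. The population size, starting count, and size of the advantage are all arbitrary assumptions chosen for illustration.

```python
import random


def next_generation(count_a, pop_size, advantage, rng):
    """Draw one new generation of pop_size offspring.

    Each offspring picks a parent at random; carriers of allele A are
    weighted by (1 + advantage), everyone else by 1.
    """
    weight_a = count_a * (1.0 + advantage)
    weight_other = pop_size - count_a
    p_a = weight_a / (weight_a + weight_other)
    # Every draw is still a throw of the dice; the dice are just slightly loaded.
    return sum(1 for _ in range(pop_size) if rng.random() < p_a)


def simulate(pop_size=500, start=50, advantage=0.05, generations=200, seed=1):
    """Return the number of allele-A carriers in each generation."""
    rng = random.Random(seed)
    count = start
    history = [count]
    for _ in range(generations):
        count = next_generation(count, pop_size, advantage, rng)
        history.append(count)
    return history


if __name__ == "__main__":
    history = simulate()
    # Print every 20th generation to show the trend without the noise.
    print(history[::20])
```

With `advantage=0.0` the allele count just drifts; with a small positive advantage it tends to climb, although chance can still eliminate it early on. There is no step in the loop where anything chooses an outcome.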

And that’s the way I like it.

I have been dealt a hand by chance, and some of my cards are real stinkers — one side of my family, for instance, has a history of early heart disease. I don’t like the bad luck there, but that it is by chance alone is far more reassuring than the idea that a meddling deity chose to give my father a battery of risk factors that led to his early death, and that he also chose to stick me with some of those, too. If a loving god were actually paying “incredible personal attention to detail”, you’d think there would have been some quality control in spermatogenesis that might have weeded out some of the defective alleles, or more precise matching of sperm and egg to make sure all weaknesses in one were compensated by strengths in the other. This doesn’t happen.

While we have all the flaws concomitant with being children of chance, we also have an advantage: we’re free. There is no cosmic fiddler. There is no domineering father in the sky who has a mission for us, who decreed at our birth that there is something we must do with our lives, who has slotted you into one specific role without your consent. You are not driven by an arbitrary external purpose, and you should find the idea of such a daily dictator of every detail of your existence abhorrent to an extreme.

It’s a real mystery to me why anyone would find the deterministic slave-philosophy of Rick Warren at all appealing or consoling, especially since the evidence all says that it is wrong, as well. There must be something some people find pleasant in surrendering responsibility to an imaginary scapegoat.

Personally, I appreciate the fact that I’m a combination of traits, some lucky and some unlucky, that are mine and not the product of the whims of some puppetmaster. I’ll make of them what I can and what I will, and who I am is my responsibility and to my credit or blame.