How insects and crustaceans molt

I was mildly surprised at the reaction to this cool timelapse video of a molting crab — some people didn’t understand how arthropods work. The only thing to do, of course, is to explain the molting process of insects and crustaceans, called ecdysis.

Let’s go back to the basics first. In the beginning was the epithelium, a continuous sheet of linked cells that envelops multicellular organisms. These are living, dividing, dynamic cells that are flexible, can repair damage to themselves, and represent the boundary between the carefully maintained internal environment of the organism, and the more variable and often hostile external environment. And that’s where the problem lies: living cells are relatively fragile and sensitive, and in particular don’t cope well with drying out. Cells like it wet, yet if you look at insects and people, we live under horrible conditions for living cells, surrounded by dryness and heat and cold.

Our external epithelia have evolved different solutions to this problem of the basic inhospitability of terrestrial life. In us, our bounding epithelia divide frequently, pushing new cells outward. As these cells move, they commit suicide, producing a fibrous protein called keratin which forms dense, matted tangles inside the cells; these cells also build tight protein connections with their neighbors. It is these dead, protein-packed cells that face the outside world, protecting the delicate interior. These cells are steadily worn away and cast off (dandruff flakes, for instance, are sheets of these dead epithelial cells), and new protective cells are produced by cell division and pushed up from the inside to replace them. It’s a good solution that allows for constant growth and flexibility.

Arthropods, on the other hand, start with a similar sheet of living epithelial cells, but do something completely different. Instead of pushing out a continuous column of dying cells, they secrete dense layers of complex chemical compounds that harden into a tough cuticle. The exoskeleton of an insect or crustacean is acellular — the living cells have protected themselves by secreting an initially fluid set of chemicals that harden like epoxy to form a tough protective armor around themselves. We protect ourselves with sheets of leather; arthropods make plates like fiberglass on their outsides.

And there’s the rub. The cuticles of insects do not gradually slough away, replaced steadily by the addition of new material from the inside. They’re mostly fixed and rigid and static. This does have the advantage of providing a solid protective armor and a rigid framework for muscles, but isn’t so great for accommodating growth. Fiberglass isn’t stretchy and flexible!

Here’s a closer look at the structure of the arthropod cuticle.

i-a88f42ffb262a0197ecc68e2a68e00f9-molting-thumb-450x148-51727.jpg

In the diagram on the left, the living epithelium is at the bottom, labeled “epidermis”. Above it are multiple acellular layers, collectively called the cuticle, made up of substances like chitin and waxes. Notice that the cuticle is also perforated by pores containing the ducts of glands that secrete these chemical substances, and by places where hairs called setae project to the exterior.

In order to grow, the animal must discard the old cuticle and build a new one from the inside out. In (b), this process begins by peeling away the living epidermal cells from the dead cuticle, creating a gap called the exuvial space, which is filled with a fluid called molting fluid. The cells then begin secreting a new cuticle from underneath, which is initially flexible.

What is poorly shown in these diagrams is that the new cuticle can be larger than the old. What that means is that the epithelium inside the old cuticle is wrinkled and convoluted to have a larger surface area. Again, the new cuticle is soft, not hard, so it can wrinkle up freely to fit. Also, to make room, the molting fluid in (c) is busily digesting the old cuticle from underneath, and the protein components are absorbed and reused to build the new cuticle.

In (d), the new cuticle is nearly fully formed, the old cuticle has been reduced to a thinner rind, and the two are separated by a thin fluid-filled space. Ecdysis, the actual molt, then occurs, and the old cuticle is discarded. Free of its confining shell, the animal inflates itself to extend the wrinkled new cuticle into larger smoothness, and the process of sclerotization, or hardening of the cuticle, begins from the outside in. Tanning agents like polyphenols are secreted through ducts onto the surface, where they are oxidized into quinones, which trigger chemical reactions that cross-link the various substances of the cuticle into a rigid structure.

If you’ve ever eaten soft-shell crabs, you’ve caught the poor creature just after a molt and before its cuticle has hardened; in large arthropods, it can take several days for the post-molt cuticle to be fully cured. The hardening is also regional. Next time you’re eating a crab leg, notice that the shaft of the limb is rigid and strong and a bit brittle, but it grades into softer, less thickly sclerotized material at the joints. These regions, called arthrodial membranes, retain the flexibility of the pre-molt cuticle.

Now go watch the video again, and it should make more sense. What you’re seeing near the end is the crab pulling soft and rubbery limbs out of the shell of its old legs, and then resting as the new cuticle slowly hardens.

Radial tree of life

I use a very pretty radial tree of life diagram fairly often — the last time was in my talk on Friday — and every time I do, people ask where I got it. Here it is: it’s from the David Hillis lab, with this description:

i-2e36bcc0846de8597dabd430ce98cafc-tree_of_life.jpeg

This file can be printed as a wall poster. Printing at least 54″ wide is recommended.
(If you would prefer a simplified version with common names, please see below.)
Blueprint shops and other places with large format printers can print this file for you.
You are welcome to use it for non-commercial educational purposes.
Please cite the source as David M. Hillis, Derrick Zwickl, and Robin Gutell, University of Texas.
About this Tree: This tree is from an analysis of small subunit rRNA sequences sampled
from about 3,000 species from throughout the Tree of Life. The species were chosen based
on their availability, but we attempted to include most of the major groups, sampled
very roughly in proportion to the number of known species in each group (although many
groups remain over- or under-represented). The number of species
represented is approximately the square-root of the number of species thought to exist on Earth
(i.e., three thousand out of an estimated nine million species), or about 0.18% of the 1.7 million
species that have been formally described and named. This tree has been used
in many museum displays and other educational exhibits, and its use for educational purposes
is welcomed.
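
The arithmetic in that description is easy to check. Here is a quick sketch in Python that uses only the figures quoted above (3,000 sampled species, an estimated nine million species, and 1.7 million formally described):

```python
import math

sampled = 3_000               # species in the tree (from the description)
estimated_total = 9_000_000   # estimated species on Earth (from the description)
described = 1_700_000         # formally described species (from the description)

# The description says the sample is roughly the square root of the total.
sqrt_of_total = math.sqrt(estimated_total)

# It also says the sample is about 0.18% of the described species.
percent_of_described = 100 * sampled / described

print(f"square root of estimated total: {sqrt_of_total:.0f}")
print(f"share of described species: {percent_of_described:.2f}%")
```

Both numbers come out as the description says: the square root of nine million is three thousand, and three thousand is a bit under 0.18% of 1.7 million.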

There’s also a simplified version:

i-c45f00d8b68f5ff488017d61358744dc-simple_tree_of_life.jpeg

Both of those are available as scalable pdfs, so you can zoom in and out to get just the right view, which is very handy.

Autism and the search for simple, direct answers

I’ve gotten some email asking for a simplified executive summary of this paper, so here it is.

A large study that screened almost a thousand autistic individuals for genetic variations that distinguish them from control individuals has found that Autism Spectrum Disorder has many different genetic causes: there isn’t one single gene responsible for ASD, but a constellation of hundreds, each with the potential to affect the development of the brain and cause the symptoms of autism. The authors don’t know exactly how each of these genes contributes to the disorder, but they have found that many of them are involved in growth and cell communication and the formation of synapses in the brain.

The bottom line is that there are many different ways to cause the symptoms of autism, and it’s a mistake to try to pin it all on single, simple causes. Any hope for amelioration lies in understanding the general functional processes that are disrupted by mutations in various pathways.

i-e88a953e59c2ce6c5e2ac4568c7f0c36-rb.png

Coming up with simple, one-size-fits-all answers to serious problems is so tempting and so satisfying. Look at autism, for instance: a mysterious disease with a wide range of expression, so wide that it is more properly called Autism Spectrum Disorder (ASD), and the popular press and various celebrities all want it to be pegged to a simple cause: it’s vaccines, or it’s mercury, or it’s the dose of the vaccines, and all we have to do to fix it is not vaccinate, or reduce the number of vaccinations, or use chelation therapy to extract poisons, and presto, a cure! This is magical thinking, pure and simple, and it doesn’t work.

ASD isn’t simple, it’s not one disease, it doesn’t have one cause, and vaccines are definitely not the cause: if there’s one thing the research has done, it’s to thoroughly rule out the idea that giving kids shots at an early age causes autism. What we’re actually discovering more and more is that ASD can be traced to genetic variation.

Again, though, the causes aren’t simple. There is no single mutation to which ASD can be pinned.

For example, one hot spot for an association of genes with autism is the long arm of chromosome 22; cases of developmental delays and autistic behavior have been associated with partial deletions in chromosome 22, and the problems have even been narrowed down to one specific gene, SHANK3, which is expressed in neurons and localized to synapses. We know that if you’ve got a broken copy of this particular gene, you’re likely to have ASD.

How many ASD individuals have this specific genetic change? 0.75%. It is a cause in less than 1% of all affected individuals, so it cannot be the sole cause of ASD. We have to get out of this mindset that tries to find single causes for complex phenomena; ASD is a case where we have a complex range of disorders with multiple, complex causes.

So how do we get a handle on ASD? This is where the work gets interesting: just because something is multi-causal does not mean that science can’t get a grip on it and that we can’t learn anything interesting about it. We’ve got lots of new tools for analyzing broad properties of genomes now, and one promising line of attack is to measure and identify copy number variants in individuals and populations.

Copy number variants (CNVs) are surprisingly common. If you’ve had any biology instruction at all, you’re probably familiar with the Mendelian concept that we have two copies of each chromosome, and two copies of each gene. As it turns out, that is an oversimplification: sometimes, a piece of a chromosome is accidentally duplicated, and then you’ll carry two copies of the associated gene on one chromosome, and one copy on another chromosome, for a total of 3 copies. And in some cases, these duplications have occurred often enough that you’ll have many more than 3; the median number of copies of the amylase gene (an enzyme that breaks down starch) in European American populations is 7, with a range of 2 to 15 in different individuals. Get used to it, this kind of variation in copy number seems to happen fairly often.

Now in the case of amylase, the effect of this variation is mild — individuals with more copies of the gene produce more of the enzyme and break down starchy foods faster. It does have evolutionary effects, since cultures with diets rich in starch contain individuals who have, on average, more copies of the gene than individuals where starches are less common in the diet. But what if these chance variations in copy number affect genes involved in the function of the brain? We might see more profound effects on behavior or cognitive ability. The defect in SHANK3 mutations is an example of a reduction in copy number of that gene; what if we could screen populations of ASD individuals not for a specific gene variant, but for the more general occurrence of frequent variations in copy number of any genes…and then we could ask which genes are often affected?

It’s being done. A new paper in Nature describes a screen of control and ASD individuals to identify rare copy number variants associated with autism. It worked! In fact, it worked maybe a little too well, since we now have an embarrassment of riches, a great many genes that may be related to ASD.

The autism spectrum disorders (ASDs) are a group of conditions characterized by impairments in reciprocal social interaction and communication, and the presence of restricted and repetitive behaviours. Individuals with an ASD vary greatly in cognitive development, which can range from above average to intellectual disability. Although ASDs are known to be highly heritable (~90%), the underlying genetic determinants are still largely unknown. Here we analysed the genome-wide characteristics of rare (<1% frequency) copy number variation in ASD using dense genotyping arrays. When comparing 996 ASD individuals of European ancestry to 1,287 matched controls, cases were found to carry a higher global burden of rare, genic copy number variants (CNVs) (1.19 fold, P = 0.012), especially so for loci previously implicated in either ASD and/or intellectual disability (1.69 fold, P = 3.4 × 10⁻⁴). Among the CNVs there were numerous de novo and inherited events, sometimes in combination in a given family, implicating many novel ASD genes such as SHANK2, SYNGAP1, DLGAP2 and the X-linked DDX53-PTCHD1 locus. We also discovered an enrichment of CNVs disrupting functional gene sets involved in cellular proliferation, projection and motility, and GTPase/Ras signalling. Our results reveal many new genetic and functional targets in ASD that may lead to final connected pathways.

They analyzed both affected individuals and their parents, and found both familial transmission — that is, the child with ASD had received a copy number variant from a parent who was a carrier — and de novo events — that is, the child had a spontaneous, new mutation that was not present in either parent. There is no one single gene that can be tagged as the cause of autism: they identified 226 de novo and 219 inherited copy number variants in affected individuals. No one individual carries all of these variants, of course — the results tell us that there are many different paths to ASD.

Oh, no, you may be tempted to wail, autism is hundreds of diseases, with even more possible combinations of variants, and every individual is unique — this is no way to get a handle on what’s actually happening to autistic kids! Don’t despair, though, this is just the start. Although there are many genes involved, we can try to ask what all of them have in common functionally. There may be common consequences from all of these different genes, so maybe we can identify the common errors in the process of building a brain that lead to ASD.

Here’s a first stab at puzzling out what these genes do. The genes that have been identified as being deficient in ASD individuals are mapped out by known functions, and what jumps out at you is that the hundreds of specific genes fall into a smaller number of functional categories. Many of them cluster in a few functional roles: cell proliferation (genes that affect the number of cells in a tissue), cell projection (particularly important in neurons, where cells will extend long processes that project into target regions), and a specific class of cell signaling molecules, RAS-GTPases, which are involved in how cells communicate with one another and are particularly important in synapses, the linkages between neurons.

i-8d23aed462751aa3822b506f48725d65-asd_map-thumb-425x181-50842.jpeg

Enrichment results were mapped as a network of gene sets (nodes) related by mutual overlap (edges), where the colour (red, blue or yellow) indicates the class of gene set. Node size is proportional to the total number of genes in each set and edge thickness represents the number of overlapping genes between sets. a, Gene sets enriched for deletions are shown (red) with enrichment significance (FDR q-value) represented as a node colour gradient. Groups of functionally related gene sets are circled and labelled (groups, filled green circles; subgroups, dashed line). b, An expanded enrichment map shows the relationship between gene sets enriched in deletions (a) and sets of known ASD/intellectual disability genes. Node colour hue represents the class of gene set (that is, enriched in deletions, red; known disease genes (ASD and/or intellectual disability (ID) genes), blue; enriched only in disease genes, yellow). Edge colour represents the overlap between gene sets enriched in deletions (green), from disease genes to enriched sets (blue), and between sets enriched in deletions and in disease genes or between disease gene-sets only (orange). The major functional groups are highlighted by filled circles (enriched in deletions, green; enriched in ASD/intellectual disability, blue).
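
To make that caption a little more concrete, here is a minimal sketch of how such a map can be put together: each gene set is a node, and two nodes are joined by an edge whose weight is the number of genes they share. The gene IDs and set memberships below are invented placeholders, not data from the paper, and the paper's actual overlap measure and thresholds may well differ.

```python
from itertools import combinations

# Hypothetical gene sets (placeholder gene IDs, not taken from the paper).
gene_sets = {
    "cell_proliferation": {"g1", "g2", "g3", "g4"},
    "cell_projection": {"g3", "g4", "g5"},
    "gtpase_ras_signalling": {"g5", "g6"},
    "unrelated_set": {"g7"},
}

def build_enrichment_map(sets, min_overlap=1):
    """Return nodes (set name -> number of genes) and
    edges (pair of set names -> number of shared genes)."""
    nodes = {name: len(genes) for name, genes in sets.items()}
    edges = {}
    for a, b in combinations(sorted(sets), 2):
        shared = len(sets[a] & sets[b])
        if shared >= min_overlap:
            edges[(a, b)] = shared
    return nodes, edges

nodes, edges = build_enrichment_map(gene_sets)
for (a, b), weight in edges.items():
    print(f"{a} -- {b}: {weight} shared genes")
```

In a drawing of this toy map, node size would follow `nodes` and edge thickness would follow `edges`, which is the same convention the caption describes.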

The second map above ties the various copy number variants to previously known disease genes involved in ASD, and what catches my eye is the dense cloud of variants associated with central nervous system development. That tells me right there that it is inappropriate to treat ASD as something that is switched on or off by simple causal factors: ASD is the product of long-developing, subtle changes in the growth of the nervous system in embryos and infants.

So the conclusion, as expected, is that ASD is a multi-factorial disorder with a strong genetic component — but definitely not single-locus inheritance, as many different genes are involved.

Our findings provide strong support for the involvement of multiple rare genic CNVs, both genome-wide and at specific loci, in ASD. These findings, similar to those recently described in schizophrenia, suggest that at least some of these ASD CNVs (and the genes that they affect) are under purifying selection. Genes previously implicated in ASD by rare variant findings have pointed to functional themes in ASD pathophysiology. Molecules such as NRXN1, NLGN3/4X and SHANK3, localized presynaptically or at the post-synaptic density (PSD), highlight maturation and function of glutamatergic synapses. Our data reveal that SHANK2, SYNGAP1 and DLGAP2 are new ASD loci that also encode proteins in the PSD. We also found intellectual disability genes to be important in ASD. Furthermore, our functional enrichment map identifies new groups such as GTPase/Ras, effectively expanding both the number and connectivity of modules that may be involved in ASD. The next step will be to relate defects or patterns of alterations in these groups to ASD endophenotypes. The combined identification of higher-penetrance rare variants and new biological pathways, including those identified in this study, may broaden the targets amenable to genetic testing and therapeutic intervention.

There aren’t any simple answers. There are some hints of hope for future treatment, though, in the recognition that there are a few functional modules that are being commonly impaired by these many different genes; it at least focuses the direction of future research into some narrower domains.

One fact is so obvious that it’s unfortunate I have to mention it: no external agent, such as a vaccine, can generate a consistent pattern of duplication and deletions in an affected individual’s cells. These data say it’s an error to chase down transient environmental agents given relatively late in life to people.


Pinto D et al. (2010) Functional impact of global rare copy number variation in autism spectrum disorders. Nature. doi:10.1038/nature09146.

It’s ALIVE!

Get in the mood for this bit of news, the synthesis of an artificial organism by Craig Venter’s research team.

Here’s the equivalent of that twitching hand of Frankenstein’s monster:

i-f02e362f829f5187cb195d95bc5e2f44-artificial_myc-thumb-425x185-49474.jpeg

Those are two colonies of Mycoplasma mycoides, their nucleoids containing entirely synthesized DNA. You can tell because the synthesized DNA contained a lacZ gene for beta-galactosidase, making the pretty blue product. That’s one of the indicators that the artificial chromosome is functioning inside the cell; the DNA was also encoded with recognizable watermarks, and they also used a cell of a different species, M. capricolum, as the host for the DNA.

The experiment involved synthesizing a strand of DNA as specified by a computer, inserting it into a dead cell of M. capricolum, and then watching it revivify and express the artificial markers and the M. mycoides proteins. It really is like bringing the dead back to life.

It was also a lot more difficult than stitching together corpses and zapping them with lightning bolts. The DNA in this cell is over one million bases long, and it all had to be assembled appropriately from machine-synthesized pieces. That was the first tricky part; current DNA synthesis machines can’t build strands that long. They could coax sequences about a thousand nucleotides long out of the machines.

Then what they had to do was splice over a thousand of these short pieces into a complete bacterial chromosome. This was done with a combination of enzymatic reactions in a test tube, and in vivo assembly by recombination inside yeast cells. The end result is a circular bacterial chromosome that is, in its sequence, almost entirely the M. mycoides genome…but made from a sequence stored in a computer rather than a parental bacterium.
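
Here is a rough back-of-the-envelope sketch of why so many pieces are needed. The genome and piece lengths are the round numbers from the text; the overlap between neighbouring pieces is a made-up illustrative value, included only because adjacent pieces have to share some sequence in order to be joined.

```python
import math

def pieces_needed(genome_length, piece_length, overlap):
    """Number of overlapping pieces needed to tile a circular genome.

    Each piece contributes (piece_length - overlap) new bases, because
    `overlap` bases are shared with its neighbour.
    """
    if overlap >= piece_length:
        raise ValueError("overlap must be shorter than the piece")
    return math.ceil(genome_length / (piece_length - overlap))

GENOME_LENGTH = 1_000_000  # "over one million bases" (round number from the text)
PIECE_LENGTH = 1_000       # "about a thousand nucleotides" (from the text)
OVERLAP = 80               # hypothetical overlap, chosen only for illustration

print(pieces_needed(GENOME_LENGTH, PIECE_LENGTH, 0))        # no overlap at all
print(pieces_needed(GENOME_LENGTH, PIECE_LENGTH, OVERLAP))  # with the assumed overlap
```

With no overlap the count is exactly a thousand; any overlap pushes it higher, which is consistent with "over a thousand" short pieces.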

i-07e40c58a4b3e918cfd76ade697a4e76-artificial_chrom-thumb-425x436-49477.jpeg

Finally, there was one more hurdle to overcome, getting this large loop of DNA into the husk of a cell. These techniques, at least, had been worked out last year in experiments in which they had transplanted natural M. mycoides chromosomes into bacteria.

The end result is a new, functioning, replicating cell. One could argue that it isn’t entirely artificial yet, since the artificial DNA is being placed in a cell of natural origin…but give it time. The turnover of lipids and proteins and such in the cytoplasm and the membrane means that within 30 generations all of the organism will have been effectively replaced, anyway.
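
The dilution argument can be sketched numerically. This assumes, as a simplification of my own, that the original cell's material is split evenly between the two daughters at each division; it ignores active turnover of molecules, which would only speed things up.

```python
def original_fraction(generations):
    """Fraction of a descendant cell made of the founder cell's material,
    assuming an even split at every division (a simplifying assumption)."""
    return 0.5 ** generations

for g in (1, 10, 20, 30):
    print(f"after {g:2d} generations: {original_fraction(g):.2e}")
```

Under that assumption, after 30 generations less than one part in a billion of any given cell traces back to the original natural recipient.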

It’s a very small cell that has been created — the mycoplasmas have the smallest genomes of any extant cells. It’s not much, but this is a breakthrough comparable to Wöhler’s synthesis of urea. That event was a revelation, because it broke the idea that organic chemicals were somehow special and incapable of synthesis from inorganic molecules. And that led to the establishment of the whole field of organic chemistry, and we all know how big and important that has become to our culture.

Venter’s synthesis of a simple life form is like the synthesis of urea in that it has the potential to lead to some huge new possibilities. Get ready for it.

If the methods described here can be generalized, design, synthesis, assembly, and transplantation of synthetic chromosomes will no longer be a barrier to the progress of synthetic biology. We expect that the cost of DNA synthesis will follow what has happened with DNA sequencing and continue to exponentially decrease. Lower synthesis costs combined with automation will enable broad applications for synthetic genomics.

We should be aware of the limitations right now, though. It was a large undertaking to assemble the 1 million base pair synthetic chromosome for a mycoplasma. If you’re dreaming of using the draft Neandertal sequence to make your own resynthesized caveman, you’re going to have to appreciate the fact that that is a job more than three orders of magnitude greater than building a bacterium. Also keep in mind that the sequence introduced into the bacterium was not exactly as intended, but contained expected small errors that had accumulated during the extended synthesis process.
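
To put a number on that gap, here is a small sketch. The mycoplasma figure is the round 1 million base pairs from the text; the roughly 3 billion base pairs for a human-sized genome is my assumption for illustration, not a number from the paper.

```python
import math

MYCOPLASMA_BP = 1_000_000        # synthetic chromosome size (from the text)
HUMAN_SIZED_BP = 3_000_000_000   # assumed approximate size of a human-like genome

ratio = HUMAN_SIZED_BP / MYCOPLASMA_BP
orders = math.log10(ratio)

print(f"{ratio:.0f} times larger, about {orders:.1f} orders of magnitude")
```

That only counts raw sequence length; it says nothing about the added difficulty of getting such a genome into a working cell.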

A single transplant originating from the sMmYCp235 synthetic genome was sequenced. We refer to this strain as M. mycoides JCVI-syn1.0. The sequence matched the intended design with the exception of the known polymorphisms, 8 new single nucleotide polymorphisms, an E. coli transposon insertion, and an 85-bp duplication. The transposon insertion exactly matches the size and sequence of IS1, a transposon in E. coli. It is likely that IS1 infected the 10-kb sub-assembly following its transfer to E. coli. The IS1 insert is flanked by direct repeats of M. mycoides sequence suggesting that it was inserted by a transposition mechanism. The 85-bp duplication is a result of a non-homologous end joining event, which was not detected in our sequence analysis at the 10-kb stage. These two insertions disrupt two genes that are evidently non-essential.

So we aren’t quite at the stage of building novel multicellular plants or animals; that’s going to be a long way down the road. But it does mean we can expect to be able to build custom bacteria within another generation, I would think, and that they will provide some major new industrial potential.

I know that there are some ethical concerns (Venter also mentions them in the paper), but I’m not personally too worried about them just yet. The cell they created is not a monster with ten times the strength of an ordinary cell and the brain of a madman; it’s actually more fragile, and contains only genes found in naturally occurring species (and a few harmless markers). When the techniques become economically practical, everyone will be building specialized bacteria to carry out very specific biochemical reactions, and again, they’re going to be poor generalists and aren’t going to be able to compete in survival with natural species that have been honed by a few billion years of selection for fecundity and survivability.

Give it a decade or two, though, and we’ll have all kinds of new capabilities in our hands. The ethical concerns now are a little premature because we have no idea what our children and grandchildren will be able to do with this power. I don’t think Wöhler could have predicted plastics from his discovery, after all: we’re going to have to sit back, enjoy the ride, and watch carefully for new promises and perils as they emerge.


Gibson et al. (2010) Creation of a Bacterial Cell Controlled by a Chemically Synthesized Genome. Science Express.

Lartigue et al. (2009) Creating Bacterial Strains from Genomes That Have Been Cloned and Engineered in Yeast. Science 325:1693-1696.

Venter has done it

We’re hearing the first stirrings of a big story: Craig Venter may have created the first organism with an artificially synthesized genome. Conceptually, building a strand of DNA and inserting it into a cell stripped of its genome is completely unsurprising — of course it will work, a cell is just chemistry — but it is a huge technical accomplishment.

Carl Zimmer has more background. I want to see the paper.

Another volley in the battle

This essay on the accommodationists vs. the ‘new atheists’ gets off to a bad start, I’m afraid, and I had some concern it was going to be another of those fuzzy articles.

There is a new war between science and religion, rising from the ashes of the old one, which ended with the defeat of the anti-evolution forces in the 2005 “intelligent design” trial.

That’s incorrect. The anti-evolutionists have not been defeated — they got smacked in the nose with a rolled-up newspaper, and that’s about it. The creationists are still thriving, and in some places (like Texas) getting even bolder and noisier.

It gets better from there, though. It’s a polite framing of the arguments between the apologists for religion and the opponents of religion, and the author favors the latter.

Neandertal!

You don’t have to tell me, I know I’m late to the party: the news about the draft Neandertal genome sequence was announced last week, and here I am getting around to it just now. In my defense, I did hastily rewrite one of my presentations to include a long section on the new genome information, so at least I was talking about it to a few people. Besides, there is coverage from a genuine expert on Neandertals, John Hawks, and of course Carl Zimmer wrote an excellent summary. All I’m going to do now is fuss over a few things on the edge that interested me.

This was an impressive technical feat. The DNA was extracted from a few bone fragments, and it was grossly degraded: the average size of a piece of DNA was less than 200 base pairs, much of that was chemically degraded, and 95-99% of the DNA extracted was from bacteria, not Neandertal. An immense amount of work was required to filter noise from the signal, to reconstruct and reassemble, and to avoid contamination from modern human DNA. These poor Neandertals had died, had rotted thoroughly, and the bacteria had worked their way into almost every crevice of the bone to chew up the remains. All that was left were a few dead cells in isolated lacunae of the bone; their DNA had been chopped up by their own enzymes, and death and chemistry had come to slowly break them down further.

Don’t hold your breath waiting for the draft genome of Homo erectus. Time is unkind.

We have to appreciate the age of these people, too. The oldest Neandertal fossils are approximately 400,000 years old, and the species went extinct about 30,000 years ago. That’s a good run; as measured by species longevity, Homo sapiens neandertalensis is more successful than Homo sapiens sapiens. We’re going to have to hang in there for another 200,000 years to top them.

The samples were taken from bones found in a cave in Vindija, Croatia. Full sequences were derived from three individuals found there, and in addition, some partial sequences were taken from other specimens, including the original type specimen found in the Neander Valley in 1856.

i-23e0eb7849e62ad0a9dbe3ae2a2a58ea-neander_source.jpeg
Samples and sites from which DNA was retrieved. (A) The three bones from Vindija from which Neandertal DNA was sequenced. (B) Map showing the four archaeological sites from which bones were used and their approximate dates (years B.P.).

The three bones used for sequencing were directly dated to 38.1, 44.5, and 44.5 thousand years ago, which puts them on the near end of the Neandertal timeline, and after the likely time of contact between modern humans and Neandertals, which probably occurred about 80,000 years ago, in the Middle East.

Just for reference: these samples are 6-7 times older than the entire earth, as dated by young earth creationists. The span of time between the youngest and oldest bones used is more than six thousand years, again about the same length of time as the entire history of the YEC universe. Imagine that: we see these bone fragments now as part of a jumble of debris from one site, but they represent differences as great as those between a modern American and an ancient Sumerian. I repeat once again: the religious imagination is paltry and petty compared to the awesome reality.
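
The arithmetic, for anyone who wants to check it. The bone dates are the ones given above; the 6,000-year figure is the usual young earth creationist age, taken here as an assumption.

```python
bone_ages = [38_100, 44_500, 44_500]  # direct dates of the three bones, in years ago (from the text)
YEC_AGE = 6_000                        # assumed young earth creationist age of the earth, in years

for age in bone_ages:
    print(f"{age} years is {age / YEC_AGE:.1f} times the YEC age")

# Gap between the youngest and the oldest of the three bones.
span = max(bone_ages) - min(bone_ages)
print(f"span between youngest and oldest bone: {span} years")
```

The span between the youngest and oldest bone alone comes to 6,400 years.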

A significant revelation from this work is the discovery of the signature of interbreeding between modern humans and Neandertals. When those humans first wandered out of the homeland of Africa into the Middle East, they encountered Neandertals already occupying the land…people they would eventually displace, but at least early on there was some sexual activity going on between the two groups, and a small number of human-Neandertal hybrids would have been incorporated into the expanding human population—at least, in that subset that was leaving Africa. Modern European, Asian, and South Pacific populations now contain 1-4% Neandertal DNA. This is really cool; I’m proud to think that I had as a many-times-great grandparent a muscular, beetle-browed big game hunter who trod Ice Age Europe, bringing down mighty mammoths with his spears.

However, it is a small contribution from the Neandertals to our lineage, and it’s not likely that these particular Neandertal genes had a dramatic effect on our ancestors. They didn’t exactly sweep rapidly and decisively through the population; it’s most likely that they are neutral hitch-hikers that surfed the wave of human expansion. Any early matings between an expanding human subpopulation and a receding Neandertal population would have left a few traces in our gene pool that would have been passively hauled up into higher numbers by time and the mere growth of human populations. In a complementary fashion, any human genes injected into the Neandertal pool would have been placed into the bleeding edge of a receding population, and would not have persisted. No uniquely human genes were found in the Neandertals examined, but that doesn’t let us judge the preferred direction of the sexual exchanges in these encounters, because any hybrids in Neandertal tribes were facing early doom, while hybrids in human tribes were in for a long ride.

Here’s the interesting part of these gene exchanges, though. We can now estimate the ancestral gene sequence, that is, the sequences of genes in the last common ancestor of humans and Neandertals, and we can ask if there are any ‘primitive’ genes that have been completely replaced in modern human populations by a different variant, while Neandertals still retained the ancestral pattern (see the red star in the diagram below). These genes could be a hint to what innovations made us uniquely human and different from Neandertals.

Selective sweep screen. Schematic illustration of the rationale for the selective sweep screen. For many regions of the genome, the variation within current humans is old enough to include Neandertals (left). Thus, for SNPs in present-day humans, Neandertals often carry the derived allele (blue). However, in genomic regions where an advantageous mutation arises (right, red star) and sweeps to high frequency or fixation in present-day humans, Neandertals will be devoid of derived alleles.
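To make the logic of that screen concrete, here is a toy sketch in Python. It is emphatically not the statistic used in the paper, which compares observed and expected numbers of Neandertal derived alleles across regions; it only illustrates the core idea of scanning human SNPs for long stretches where the Neandertal never carries the derived allele. The `Snp` record, the `candidate_sweeps` function, the `min_snps` threshold, and the data are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Snp:
    position: int             # hypothetical coordinate along a chromosome
    neandertal_derived: bool  # True if the Neandertal carries the derived allele


def candidate_sweeps(snps: List[Snp], min_snps: int = 5) -> List[Tuple[int, int, int]]:
    """Return (start, end, n_snps) for each run of consecutive human SNPs
    at which the Neandertal shows only the ancestral allele."""
    runs: List[Tuple[int, int, int]] = []
    run: List[Snp] = []
    for snp in sorted(snps, key=lambda s: s.position):
        if snp.neandertal_derived:
            # A derived allele in the Neandertal ends the current run.
            if len(run) >= min_snps:
                runs.append((run[0].position, run[-1].position, len(run)))
            run = []
        else:
            run.append(snp)
    # Do not forget a run that reaches the end of the data.
    if len(run) >= min_snps:
        runs.append((run[0].position, run[-1].position, len(run)))
    return runs


# Made-up data: one SNP every 1000 bases, with a six-SNP stretch in the
# middle where the Neandertal is devoid of derived alleles.
flags = [True, False, True, True, False, False, False,
         False, False, False, True, False, True]
snps = [Snp(position=1000 * i, neandertal_derived=f) for i, f in enumerate(flags)]

print(candidate_sweeps(snps, min_snps=5))  # [(4000, 9000, 6)]
```

In real data the threshold would have to account for how common each derived allele is in humans, since a rare derived allele is unlikely to show up in a Neandertal anyway; this sketch ignores that entirely.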

There’s good news and bad news. The bad news is that there aren’t very many of them: a grand total of 78 genes were identified that have a novel form and that have been fixed in the modern human population. That’s not very many, so if you’re an exceptionalist looking for justification of your superiority to our ancestors, you haven’t got much to go on. The good news, though, is that there are only 78 genes! This is a manageable number, and represents some useful hints to genes that would be worth studying in more detail.

One other qualification, though: these are 78 genes that have changes in their coding sequence. There are also several hundred other non-coding, presumably regulatory, sequences that are unique to humans and are fixed throughout our population. To the evo-devo mind, these might actually be the more interesting changes, eventually…but right now, there are some tantalizing prospects in the coding genes to look at.

Some of the genes with novel sequences in humans are DYRK1A, a gene that is present in three copies in Down syndrome individuals and is suspected of playing a role in their mental deficits; NRG3, a gene associated with schizophrenia; and CADPS2 and AUTS2, two genes associated with autism. These are exciting prospects for further study because they have alleles unique and universal to humans and not Neandertals, and also affect the functioning of the brain. However, let’s not get confused about what that means for Neandertals. These are genes that, when broken or modified in modern humans, have consequences for the brain. Neandertals had these same genes, but different forms or alleles of them, which are also different from the mutant forms that cause problems in modern humans. Neandertals did not necessarily have autism, schizophrenia, or the minds of people with Down syndrome! The diseases are just indications that these genes are involved in the nervous system, and the differences in the Neandertal forms almost certainly caused much more subtle effects.

Another gene that has some provocative potential is RUNX2. That’s short for Runt-related transcription factor 2, which should make all the developmental biologists sit up and pay attention. It’s a transcription factor, so it’s a regulator of many other genes, and it’s related to Runt, a well known gene in flies that is important in segmentation. In humans, RUNX2 is a regulator of bone growth, and is a master control switch for patterning bone. In modern humans, defects in this gene lead to a syndrome called cleidocranial dysplasia, in which bones of the skull fuse late, leading to anomalies in the shape of the head, and which also causes characteristic defects in the shape of the collar bones and shoulder articulations. These, again, are places where Neandertals and modern humans differ significantly in morphology (and again, Neandertals did not have cleidocranial dysplasia; they had forms of the RUNX2 gene that would have contributed to the specific arrangements of their healthy, normal anatomy).

These are tantalizing hints at how human/Neandertal differences could have arisen: by small changes in a few genes that would have had a fairly extensive scope of effect. Don’t view the many subtle differences between the two as each a consequence of a specific genetic change; a variation in a gene like RUNX2 can lead to coordinated, integrated changes to multiple aspects of the phenotype, in this case, affecting the shape of the skull, the chest, and the shoulders.

This is a marvelous insight into our history, and represents some powerful knowledge we can bring to bear on our understanding of human evolution. The only frustrating thing is that this amazing work has been done in a species on which we can’t, for ethical reasons, do the obvious experiments of creating artificial revertants of sets of genes to the ancestral state — we don’t get to resurrect a Neandertal. With the tools that Pääbo and colleagues have developed, though, perhaps we can start considering some paleogenomics projects to get not just the genomic sequences of modern forms, but of their ancestors as well. I’d like to see the genomic differences between elephants and mastodons, and tigers and sabre-toothed cats…and maybe someday we can think about rebuilding a few extinct species.


Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MH, Hansen NF, Durand EY, Malaspinas AS, Jensen JD, Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R, Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A, Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan P, Brajkovic D, Kucan Z, Gusic I, Doronichev VB, Golovanova LV, Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW, Johnson PL, Eichler EE, Falush D, Birney E, Mullikin JC, Slatkin M, Nielsen R, Kelso J, Lachmann M, Reich D, Pääbo S. (2010) A draft sequence of the Neandertal genome. Science 328(5979):710-22.