An unpaleontological lament for lost molecules and shattered cells and the cruelty of time

Sometimes, I really hate fossils. I hate them with the passion of a spurned lover, one who is consumed with desire but knows that he will never, ever be satisfied. They drive me mad.

Right now we’re at a point in our technology where we can take a small sample from a living organism and break it down into amazing detail: we can extract every gene, throw them into a computer, and compare them with every other gene that has been similarly sampled. We can look for the scars of evolution, we can analyze and figure out where on the tree of life this cell resides, we can even figure out what local populations it lived in, who its ancestors bred with, and to a certain extent, what various alleles contributed to its form and physiology. We don’t know everything, but every time someone works out some new detail in a related species, it goes into the databases and presto, the information cascades through every other relative. I’d call it magic, but that would insult the science with cheap understatement.

We can’t do that with most fossils (with some recent exceptions). The cells are gone. Their contents are obliterated — DNA fragmented, dissolved, corrupted, lost. And the farther back in time we go, the less information we have, but the more interesting the problems become.

All organisms are built of cells — they’re like the Lego building blocks of biology, with specific features that snap them together. With Legos, of course, you can build all kinds of different forms: stick them together and build a Lego Triceratops or a Lego T. rex. Different on the outside, different in arrangement, different in pattern, but all fundamentally built of the same kinds of blocks. I can get into the coolness of digging up a Triceratops or a T. rex, but these are all variations on a theme of phylum Chordata, superclass Tetrapoda, and they’re all using the same building blocks, and all the really interesting stuff, the details in the genome that make one morphology different than another, have all been bled out on the sands of time and gnawed by all-devouring bacteria and reduced to at best a non-specific smear of carbon. That makes me frustrated.

Even worse, most familiar fossils are big bony animals, and they’re all pretty much the same, deep down. If they’re built of Legos, there are whole other clades of multicellular organisms that are the equivalent of Meccano, Lincoln Logs, Capsela, and Tinkertoys. How were they put together? And how did they evolve these different patterns of connections? To know that, we have to go way back into deep time, and look at the unicellular organisms, the cells that first pioneered patterns of interactions and laid down the possible rules of development that enabled big clumsy multicellular organisms to accumulate the bulk that made them more likely to be fossilized. Those pioneers are practically nonexistent in the fossil record.

What prompts my lament for lost cells is this recent amazing discovery: a collection of fossilized multicellular organisms unearthed in Gabon that are 2.1 billion years old. Keep in mind that in comparison, the Cambrian explosion, the event that was the root of familiar animal diversity, was a mere half billion years ago, so these are genuinely ancient. They’re also beautiful.

i-fcc88213964aed91cefd444a1530f14d-gabon_fossils-thumb-400x525-53117.jpg

Samples show a disparity of forms based on: external size and shape characteristics; peripheral radial microfabric (missing in view d); patterns of topographic thickness distribution; general inner structural organization, including occurrence of folds (seen in views b and c) and of a nodular pyrite concretion in the central part of the fossil (absent in views a and b). a, Original specimen. b, Volume rendering in semi-transparency. c, Transverse (axial) two-dimensional section. d, Longitudinal section running close to the estimated central part of the specimen. Scale bars, 5 mm. Specimens from top to bottom: G-FB2-f-mst1.1, G-FB2-f-mst2.1, G-FB2-f-mst3.1, G-FB2-f-mst4.1.

These small, flat, furrowed sheets lived at a kind of temporal boundary, a few hundred million years after a rise in atmospheric oxygen called the Great Oxygenation Event. This was a crisis in the history of life on earth, which occurred when the oxygen produced by photosynthetic organisms could no longer be buffered by reacting chemically with minerals, and began to build up in the atmosphere. It was catastrophic for most of the organisms living at that time, which were anaerobic and found oxygen to be a caustic poison. It was an advantage to a subset that adapted to use oxygen as a fuel in chemical reactions, though, so there were also the beginnings of new forms which exploited this newly oxygenated atmosphere. That’s where these mysterious blobs come in; they were found in formations that had a chemical signature indicating the presence of free oxygen.

These were almost certainly colonial organisms that took advantage of the higher concentration of oxygen to build denser mats on top of the sea floor. They probably weren’t true multi-cellular organisms; they were a step up from a colony of bacteria that you might see growing on a petri dish, but with additional molecular features that permitted greater coordination and the development of more elaborate spatial patterning.

We also know that these had to have been very different from organisms that exist now. Those are not animals, they are not plants, they are not fungi — they are something primeval and radically different, organisms that most likely do not have any living descendants. Those are real aliens in the photo above. There is no category in your experience which you can put them into.

It’s what we don’t know that inflames my curiosity. One of the other things that was going on during the Great Oxygenation Event was the steady loss of dissolved iron in the seas: it was all being oxidized, rusted out, and precipitating out, forming geological structures like the banded iron formations. It was also facilitating the preservation of these organisms by pyritizing them: all their soft gooey bits, the whole of the creature, were being replaced by fool’s gold, iron pyrite. There are no cells left here. We don’t even know for sure that these are eukaryotic cells; they probably are, as indicated by the presence of a sterane chemical signature in the rocks that is characteristic of eukaryotes, but there isn’t even enough fine detail to tell whether there was a nucleus in these cells. It just breaks my heart.

It’s a beautiful tease. We can see that life was exploring the edges of multicellularity over 2 billion years ago, but…the molecular sinews that stitched them together are all gone. The signals and receptors that enabled communication between them are all gone. The genes that drove their growth are all gone. There is nothing left but a blurry crystal-ruptured outline of what once was.

I have to shake an angry fist at you, fossils. I won’t go all Mel Gibson in incoherent rage at you because I like you too much, but still…you taunt me. I want your cells. Nothing less will do.


El Albani A, Bengtson S, Canfield DE, Bekker A, Macchiarelli R, Mazurier A, Hammarlund EU, Boulvais P, Dupuy JJ, Fontaine C, Fürsich FT, Gauthier-Lafaye F, Janvier P, Javaux E, Ossa FO, Pierson-Wickmann AC, Riboulleau A, Sardini P, Vachard D, Whitehouse M, Meunier A. (2010) Large colonial organisms with coordinated growth in oxygenated environments 2.1 Gyr ago. Nature 466(7302):100-4.


Chris Nedin, who should know, does not think these fossils represent multicellular organisms at all — they are fossilized, folded microbial mats. Which is fine by me — 2 billion year old microbial mats are also exceedingly cool, and I still want their cells.

You do know that if you want to know more about anything pre-Cambrian, you should be reading Ediacaran, right?

How insects and crustaceans molt

I was mildly surprised at the reaction to this cool timelapse video of a molting crab — some people didn’t understand how arthropods work. The only thing to do, of course, is to explain the molting process of insects and crustaceans, called ecdysis.

Let’s go back to the basics first. In the beginning was the epithelium, a continuous sheet of linked cells that envelops multicellular organisms. These are living, dividing, dynamic cells that are flexible, can repair damage to themselves, and represent the boundary between the carefully maintained internal environment of the organism, and the more variable and often hostile external environment. And that’s where the problem lies: living cells are relatively fragile and sensitive, and in particular don’t cope well with drying out. Cells like it wet, yet if you look at insects and people, we live under horrible conditions for living cells, surrounded by dryness and heat and cold.

Our external epithelia have evolved different solutions to this problem of the basic inhospitability of terrestrial life. In us, our bounding epithelia divide frequently, pushing new cells outward. As these cells move, they commit suicide, producing a fibrous protein called keratin which forms dense, matted tangles inside the cells; these cells also build tight protein connections between their neighbors. It is these dead, protein-packed cells that face the outside world, protecting the delicate interior. These cells are steadily worn away and cast off — dandruff flakes, for instance, are sheets of these dead epithelial cells — and new protective cells produced by cell division and pushed up from the inside out to replace them. It’s a good solution that allows for constant growth and flexibility.

Arthropods, on the other hand, start with a similar sheet of living epithelial cells, but do something completely different. Instead of pushing out a continuous column of dying cells, they secrete dense layers of complex chemical compounds that harden into a tough cuticle. The exoskeleton of an insect or crustacean is acellular — the living cells have protected themselves by secreting an initially fluid set of chemicals that harden like epoxy to form a tough protective armor around themselves. We protect ourselves with sheets of leather; arthropods make plates like fiberglass on their outsides.

And there’s the rub. The cuticles of insects do not gradually slough away, replaced steadily by the addition of new material from the inside. They’re mostly fixed and rigid and static. This does have the advantage of providing a solid protective armor and a rigid framework for muscles, but isn’t so great for accommodating growth. Fiberglass isn’t stretchy and flexible!

Here’s a closer look at the structure of the arthropod cuticle.

i-a88f42ffb262a0197ecc68e2a68e00f9-molting-thumb-450x148-51727.jpg

In the diagram on the left, the living epithelium is at the bottom, labeled “epidermis”. Above it are multiple acellular layers called the cuticle, made up of substances like chitin and waxes (notice that it is also perforated by pores containing ducts of the glands that secrete the chemical substances, and by places where hairs called setae can dangle into the exterior).

In order to grow, the animal must discard the old cuticle and build a new one from the inside out. In (b), this process begins by peeling away the living epidermal cells from the dead cuticle, creating a gap called the exuvial space, which is filled with a fluid called molting fluid. The cells then begin secreting a new cuticle from underneath, which is initially flexible.

What is poorly shown in these diagrams is that the new cuticle can be larger than the old. What that means is that the epithelium inside the old cuticle is wrinkled and convoluted to have a larger surface area. Again, the new cuticle is soft, not hard, so it can wrinkle up freely to fit. Also, to make room, the molting fluid in (c) is busily digesting the old cuticle from underneath, and the protein components are absorbed and reused to build the new cuticle.

In (d), the new cuticle is nearly fully formed, the old cuticle has been reduced to a thinner rind, and the two are separated by a thin fluid-filled space. Ecdysis, the actual molt, then occurs, and the old cuticle is discarded. Free of its confining shell, the animal inflates itself to extend the wrinkled new cuticle into larger smoothness, and the process of sclerotization, or hardening of the cuticle, begins from the outside in. Tanning agents, like polyphenols, are secreted through ducts onto the surface, where they are oxidized into quinones, which trigger chemical reactions that cross-link the various substances of the cuticle into a rigid structure.

If you’ve ever eaten soft-shell crabs, you’ve caught the poor creature just after a molt and before its cuticle has hardened — in large arthropods, it can take several days for the post-molt cuticle to be fully cured. The hardening is also regional. Next time you’re eating a crab leg, notice that the shaft of the limb is rigid and strong and a bit brittle, but it grades into softer, less thickly sclerotized material at the joints called arthrodial membranes, which retains the flexibility of the pre-molt cuticle.

Now go watch the video again, and it should make more sense. What you’re seeing near the end is the crab pulling soft and rubbery limbs out of the shell of its old legs, and then resting as the new cuticle slowly hardens.

Radial tree of life

I use a very pretty radial tree of life diagram fairly often — the last time was in my talk on Friday — and every time I do, people ask where I got it. Here it is: it’s from the David Hillis lab, with this description:

i-2e36bcc0846de8597dabd430ce98cafc-tree_of_life.jpeg

This file can be printed as a wall poster. Printing at least 54″ wide is recommended.
(If you would prefer a simplified version with common names, please see below.)
Blueprint shops and other places with large format printers can print this file for you.
You are welcome to use it for non-commercial educational purposes.
Please cite the source as David M. Hillis, Derrick Zwickl, and Robin Gutell, University of Texas.
About this Tree: This tree is from an analysis of small subunit rRNA sequences sampled
from about 3,000 species from throughout the Tree of Life. The species were chosen based
on their availability, but we attempted to include most of the major groups, sampled
very roughly in proportion to the number of known species in each group (although many
groups remain over- or under-represented). The number of species
represented is approximately the square-root of the number of species thought to exist on Earth
(i.e., three thousand out of an estimated nine million species), or about 0.18% of the 1.7 million
species that have been formally described and named. This tree has been used
in many museum displays and other educational exhibits, and its use for educational purposes
is welcomed.
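
The arithmetic in that description is easy to check for yourself. Here is a quick sketch in Python that uses only the figures quoted above; the variable names are mine, not the Hillis lab's.

```python
import math

# Figures quoted in the Hillis lab description above.
species_in_tree = 3_000
estimated_species = 9_000_000   # estimated species on Earth
described_species = 1_700_000   # formally described and named

# The tree samples roughly the square root of the estimated total.
sqrt_of_estimate = math.isqrt(estimated_species)

# Share of the described species that made it into the tree, in percent.
percent_of_described = 100 * species_in_tree / described_species

print(sqrt_of_estimate)                 # 3000
print(round(percent_of_described, 2))   # 0.18
```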

There’s also a simplified version:

i-c45f00d8b68f5ff488017d61358744dc-simple_tree_of_life.jpeg

Both of those are available as scalable PDFs, so you can zoom in and out to get just the right view, which is very handy.

Autism and the search for simple, direct answers

I’ve gotten some email asking for a simplified executive summary of this paper, so here it is.

A large study that screened almost a thousand autistic individuals for genetic variations that distinguish them from control individuals has found that Autism Spectrum Disorder has many different genetic causes: there isn’t one single gene responsible for ASD, but a constellation of hundreds, each with the potential to affect the development of the brain and cause the symptoms of autism. They don’t know exactly how each of these genes contributes to the disorder, but they have found that many of them are involved in growth and cell communication and the formation of synapses in the brain.

The bottom line is that there are many different ways to cause the symptoms of autism, and it’s a mistake to try to pin it all on single, simple causes. Any hope for amelioration lies in understanding the general functional processes that are disrupted by mutations in various pathways.

Coming up with simple, one-size-fits-all answers to serious problems is so tempting and so satisfying. Look at autism, for instance: a mysterious disease with a wide range of expression, so wide that it is more properly called Autism Spectrum Disorder (ASD), and the popular press and various celebrities all want it to be pegged to a simple cause: it’s vaccines, or it’s mercury, or it’s the dose of the vaccines, and all we have to do to fix it is not vaccinate, or reduce the number of vaccinations, or use chelation therapy to extract poisons, and presto, a cure! This is magical thinking, pure and simple, and it doesn’t work.

ASD isn’t simple, it’s not one disease, it doesn’t have one cause, and vaccines are definitely not the cause: if there’s one thing the research has done, it’s to thoroughly rule out the idea that giving kids shots at an early age causes autism. What we’re actually discovering more and more is that ASD can be traced to genetic variation.

Again, though, the causes aren’t simple. There is no single mutation to which ASD can be pinned.

For example, one hot spot for an association of genes with autism is the long arm of chromosome 22; cases of developmental delays and autistic behavior have been associated with partial deletions in chromosome 22, and the problems have even been narrowed down to one specific gene, SHANK3, which is expressed in neurons and localized to synapses. We know that if you’ve got a broken copy of this particular gene, you’re likely to have ASD.

How many ASD individuals have this specific genetic change? 0.75%. It is a cause in less than 1% of all affected individuals, but it cannot be the sole cause of ASD in all cases. We have to get out of this mindset that tries to find single causes for complex phenomena; ASD is a case where we have a complex range of disorders with multiple, complex causes.

So how do we get a handle on ASD? This is where the work gets interesting: just because something is multi-causal does not mean that science can’t get a grip on it and that we can’t learn anything interesting about it. We’ve got lots of new tools for analyzing broad properties of genomes now, and one promising line of attack uses methods for measuring and identifying copy number variants in individuals and populations.

Copy number variants (CNVs) are surprisingly common. If you’ve had any biology instruction at all, you’re probably familiar with the Mendelian concept that we have two copies of each chromosome, and two copies of each gene. As it turns out, that is an oversimplification: sometimes, a piece of a chromosome is accidentally duplicated, and then you’ll carry two copies of the associated gene on one chromosome, and one copy on another chromosome, for a total of 3 copies. And in some cases, these duplications have occurred often enough that you’ll have many more than 3; the median number of copies of the amylase gene (an enzyme that breaks down starch) in European American populations is 7, with a range of 2 to 15 in different individuals. Get used to it, this kind of variation in copy number seems to happen fairly often.

Now in the case of amylase, the effect of this variation is mild — individuals with more copies of the gene produce more of the enzyme and break down starchy foods faster. It does have evolutionary effects, since cultures with diets rich in starch contain individuals who have, on average, more copies of the gene than individuals where starches are less common in the diet. But what if these chance variations in copy number affect genes involved in the function of the brain? We might see more profound effects on behavior or cognitive ability. The defect in SHANK3 mutations is an example of a reduction in copy number of that gene; what if we could screen populations of ASD individuals not for a specific gene variant, but for the more general occurrence of frequent variations in copy number of any genes…and then we could ask which genes are often affected?

It’s being done. A new paper in Nature describes a screen of control and ASD individuals to identify rare copy number variants associated with autism. It worked! In fact, it worked maybe a little too well, since we now have an embarrassment of riches, a great many genes that may be related to ASD.

The autism spectrum disorders (ASDs) are a group of conditions characterized by impairments in reciprocal social interaction and communication, and the presence of restricted and repetitive behaviours. Individuals with an ASD vary greatly in cognitive development, which can range from above average to intellectual disability. Although ASDs are known to be highly heritable (~90%), the underlying genetic determinants are still largely unknown. Here we analysed the genome-wide characteristics of rare (<1% frequency) copy number variation in ASD using dense genotyping arrays. When comparing 996 ASD individuals of European ancestry to 1,287 matched controls, cases were found to carry a higher global burden of rare, genic copy number variants (CNVs) (1.19 fold, P = 0.012), especially so for loci previously implicated in either ASD and/or intellectual disability (1.69 fold, P = 3.4 × 10^-4). Among the CNVs there were numerous de novo and inherited events, sometimes in combination in a given family, implicating many novel ASD genes such as SHANK2, SYNGAP1, DLGAP2 and the X-linked DDX53-PTCHD1 locus. We also discovered an enrichment of CNVs disrupting functional gene sets involved in cellular proliferation, projection and motility, and GTPase/Ras signalling. Our results reveal many new genetic and functional targets in ASD that may lead to final connected pathways.

They analyzed both affected individuals and their parents, and found both familial transmission — that is, the child with ASD had received a copy number variant from a parent who was a carrier — and de novo events — that is, the child had a spontaneous, new mutation that was not present in either parent. There is no one single gene that can be tagged as the cause of autism: they identified 226 de novo and 219 inherited copy number variants in affected individuals. No one individual carries all of these variants, of course — the results tell us that there are many different paths to ASD.
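
The logic of sorting variants into inherited and de novo can be sketched in a few lines of Python. This is a toy illustration, not the paper's pipeline: the CNV labels are made up, and a real analysis would compare genomic coordinates and overlaps on array data rather than matching names.

```python
def classify_cnvs(child, mother, father):
    """Split a child's CNVs into inherited and de novo sets.

    Each argument is a set of hypothetical CNV labels. A variant seen in
    either parent counts as inherited; one seen in neither is de novo.
    """
    parental = mother | father
    inherited = child & parental
    de_novo = child - parental
    return inherited, de_novo


# Made-up labels for one family trio, purely for illustration.
child = {"del_22q13", "dup_16p11", "del_2p16"}
mother = {"dup_16p11"}
father = {"dup_7q11"}

inherited, de_novo = classify_cnvs(child, mother, father)
print(sorted(inherited))  # ['dup_16p11']
print(sorted(de_novo))    # ['del_22q13', 'del_2p16']
```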

Oh, no, you may be tempted to wail, autism is hundreds of diseases, with even more possible combinations of variants, and every individual is unique — this is no way to get a handle on what’s actually happening to autistic kids! Don’t despair, though, this is just the start. Although there are many genes involved, we can try to ask what all of them have in common functionally. There may be common consequences from all of these different genes, so maybe we can identify the common errors in the process of building a brain that lead to ASD.

Here’s a first stab at puzzling out what these genes do. The genes that have been identified as being deficient in ASD individuals are mapped out by known functions, and what jumps out at you is that the hundreds of specific genes fall into a smaller number of functional categories. Many of them cluster in a few functional roles: cell proliferation (genes that affect the number of cells in a tissue) and cell projection (particularly important in neurons, where cells will extend long processes that project into target regions), and a specific class of cell signaling molecules, RAS-GTPases, which are involved in how cells communicate with one another and are particularly important in synapses, or the linkages between neurons.

i-8d23aed462751aa3822b506f48725d65-asd_map-thumb-425x181-50842.jpeg

Enrichment results were mapped as a network of gene sets (nodes) related by mutual overlap (edges), where the colour (red, blue or yellow) indicates the class of gene set. Node size is proportional to the total number of genes in each set and edge thickness represents the number of overlapping genes between sets. a, Gene sets enriched for deletions are shown (red) with enrichment significance (FDR q-value) represented as a node colour gradient. Groups of functionally related gene sets are circled and labelled (groups, filled green circles; subgroups, dashed line). b, An expanded enrichment map shows the relationship between gene sets enriched in deletions (a) and sets of known ASD/intellectual disability genes. Node colour hue represents the class of gene set (that is, enriched in deletions, red; known disease genes (ASD and/or intellectual disability (ID) genes), blue; enriched only in disease genes, yellow). Edge colour represents the overlap between gene sets enriched in deletions (green), from disease genes to enriched sets (blue), and between sets enriched in deletions and in disease genes or between disease gene-sets only (orange). The major functional groups are highlighted by filled circles (enriched in deletions, green; enriched in ASD/intellectual disability, blue).
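
If the caption is hard to picture, the underlying data structure is simple: gene sets are nodes sized by how many genes they hold, and two sets are joined by an edge weighted by the number of genes they share. Here is a minimal sketch of that idea in Python. The set names echo the functional groups in the paper, but the gene memberships are placeholders I invented, and the function name is my own.

```python
from itertools import combinations

# Placeholder memberships, invented for illustration only.
gene_sets = {
    "cell proliferation": {"geneA", "geneB", "geneC"},
    "cell projection": {"geneB", "geneC", "geneD", "geneE"},
    "GTPase/Ras signalling": {"geneE", "geneF"},
}


def build_overlap_network(gene_sets):
    """Return (nodes, edges) for a gene-set overlap map.

    nodes maps each set name to its size (the node size in the figure);
    edges maps a pair of set names to the number of shared genes
    (the edge thickness). Pairs with no shared genes get no edge.
    """
    nodes = {name: len(genes) for name, genes in gene_sets.items()}
    edges = {}
    for a, b in combinations(sorted(gene_sets), 2):
        shared = gene_sets[a] & gene_sets[b]
        if shared:
            edges[(a, b)] = len(shared)
    return nodes, edges


nodes, edges = build_overlap_network(gene_sets)
for (a, b), weight in edges.items():
    print(f"{a} <-> {b}: {weight} shared gene(s)")
```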

The second map above ties the various copy number variants to previously known disease genes involved in ASD, and what catches my eye is the dense cloud of variants associated with central nervous system development. That tells me right there that it is inappropriate to treat ASD as something that is switched on or off by simple causal factors: ASD is the product of long-developing, subtle changes in the growth of the nervous system in embryos and infants.

So the conclusion, as expected, is that ASD is a multi-factorial disorder with a strong genetic component — but definitely not single-locus inheritance, as many different genes are involved.

Our findings provide strong support for the involvement of multiple rare genic CNVs, both genome-wide and at specific loci, in ASD. These findings, similar to those recently described in schizophrenia, suggest that at least some of these ASD CNVs (and the genes that they affect) are under purifying selection. Genes previously implicated in ASD by rare variant findings have pointed to functional themes in ASD pathophysiology. Molecules such as NRXN1, NLGN3/4X and SHANK3, localized presynaptically or at the post-synaptic density (PSD), highlight maturation and function of glutamatergic synapses. Our data reveal that SHANK2, SYNGAP1 and DLGAP2 are new ASD loci that also encode proteins in the PSD. We also found intellectual disability genes to be important in ASD. Furthermore, our functional enrichment map identifies new groups such as GTPase/Ras, effectively expanding both the number and connectivity of modules that may be involved in ASD. The next step will be to relate defects or patterns of alterations in these groups to ASD endophenotypes. The combined identification of higher-penetrance rare variants and new biological pathways, including those identified in this study, may broaden the targets amenable to genetic testing and therapeutic intervention.

There aren’t any simple answers. There are some hints of hope for future treatment, though, in the recognition that there are a few functional modules that are being commonly impaired by these many different genes; it at least focuses the direction of future research into some narrower domains.

One fact is so obvious that it’s unfortunate I have to mention it: no external agent, such as a vaccine, can generate a consistent pattern of duplication and deletions in an affected individual’s cells. These data say it’s an error to chase down transient environmental agents given relatively late in life to people.


Pinto D et al. (2010) Functional impact of global rare copy number variation in autism spectrum disorders. Nature doi:10.1038/nature09146.

It’s ALIVE!

Get in the mood for this bit of news, the synthesis of an artificial organism by Craig Venter’s research team.

Here’s the equivalent of that twitching hand of Frankenstein’s monster:

i-f02e362f829f5187cb195d95bc5e2f44-artificial_myc-thumb-425x185-49474.jpeg

Those are two colonies of Mycoplasma mycoides, their nucleoids containing entirely synthesized DNA. You can tell because the synthesized DNA contained a lacZ gene for beta-galactosidase, making the pretty blue product. That’s one of the indicators that the artificial chromosome is functioning inside the cell; the DNA was also encoded with recognizable watermarks, and they also used a cell of a different species, M. capricolum, as the host for the DNA.

The experiment involved creating a strand of DNA as specified by a computer in a DNA synthesis machine, and inserting it into a dead cell of M. capricolum, and then watching it revivify and express the artificial markers and the M. mycoides proteins. It really is like bringing the dead back to life.

It was also a lot more difficult than stitching together corpses and zapping them with lightning bolts. The DNA in this cell is over one million bases long, and it all had to be assembled appropriately with a synthesis machine. That was the first tricky part; current machines can’t build DNA strands that long. They could coax sequences about a thousand nucleotides long out of the machines.

Then what they had to do was splice over a thousand of these short pieces into a complete bacterial chromosome. This was done with a combination of enzymatic reactions in a test tube, and in vivo assembly by recombination inside yeast cells. The end result is a circular bacterial chromosome that is, in its sequence, almost entirely the M. mycoides genome…but made from a sequence stored in a computer rather than a parental bacterium.
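
To get a feel for the bookkeeping, here is a toy sketch in Python. It is only an analogy: the real joining was done by enzymes and by recombination in yeast, not by string matching. I am assuming that neighbouring pieces share a short overlapping end, which the description above does not spell out, and the sequences and function names are invented.

```python
# Round numbers from the text: a roughly 1,000,000-base genome built
# from pieces of about 1,000 nucleotides each.
genome_length = 1_000_000
piece_length = 1_000
pieces_needed = genome_length // piece_length
print(pieces_needed)  # 1000


def join_overlapping(left, right, min_overlap=3):
    """Join two fragments where the end of left matches the start of right."""
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    raise ValueError("fragments do not overlap")


def assemble(fragments, min_overlap=3):
    """Join an ordered list of overlapping fragments into one sequence."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        assembled = join_overlapping(assembled, fragment, min_overlap)
    return assembled


# Invented fragments; each shares four bases with its neighbour.
fragments = ["ATGGCTAGCT", "AGCTTAGGCTAA", "CTAACGTTAGC"]
print(assemble(fragments))  # ATGGCTAGCTTAGGCTAACGTTAGC
```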

i-07e40c58a4b3e918cfd76ade697a4e76-artificial_chrom-thumb-425x436-49477.jpeg

Finally, there was one more hurdle to overcome, getting this large loop of DNA into the husk of a cell. These techniques, at least, had been worked out last year in experiments in which they had transplanted natural M. mycoides chromosomes into bacteria.

The end result is a new, functioning, replicating cell. One could argue that it isn’t entirely artificial yet, since the artificial DNA is being placed in a cell of natural origin…but give it time. The turnover of lipids and proteins and such in the cytoplasm and the membrane means that within 30 generations all of the organism will have been effectively replaced, anyway.
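
A back-of-the-envelope calculation shows why 30 generations is plenty. This sketch rests on an assumption of mine, not a figure from the paper: that each division splits the original cell's material evenly between the two daughters. Active turnover of lipids and proteins would only speed things up.

```python
# Assumption (not from the paper): every division splits the original
# cell's material evenly between the two daughters, with no extra turnover.
generations = 30

# Each generation halves the share of original material in any one cell.
dilution_factor = 2 ** generations
fraction_remaining = 1 / dilution_factor

print(dilution_factor)               # 1073741824
print(f"{fraction_remaining:.1e}")   # 9.3e-10
```

That is less than one part in a billion of the original host cell left in any given descendant.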

It’s a very small cell that has been created — the mycoplasmas have the smallest genomes of any extant cells. It’s not much, but this is a breakthrough comparable to Wöhler’s synthesis of urea. That event was a revelation, because it broke the idea that organic chemicals were somehow special and incapable of synthesis from inorganic molecules. And that led to the establishment of the whole field of organic chemistry, and we all know how big and important that has become to our culture.

Venter’s synthesis of a simple life form is like the synthesis of urea in that it has the potential to lead to some huge new possibilities. Get ready for it.

If the methods described here can be generalized, design, synthesis, assembly, and transplantation of synthetic chromosomes will no longer be a barrier to the progress of synthetic biology. We expect that the cost of DNA synthesis will follow what has happened with DNA sequencing and continue to exponentially decrease. Lower synthesis costs combined with automation will enable broad applications for synthetic genomics.

We should be aware of the limitations right now, though. It was a large undertaking to assemble the 1 million base pair synthetic chromosome for a mycoplasma. If you’re dreaming of using the draft Neandertal sequence to make your own resynthesized caveman, you’re going to have to appreciate the fact that that is a job more than three orders of magnitude greater than building a bacterium. Also keep in mind that the sequence introduced into the bacterium was not exactly as intended, but contained expected small errors that had accumulated during the extended synthesis process.

A single transplant originating from the sMmYCp235 synthetic genome was sequenced. We refer to this strain as M. mycoides JCVI-syn1.0. The sequence matched the intended design with the exception of the known polymorphisms, 8 new single nucleotide polymorphisms, an E. coli transposon insertion, and an 85-bp duplication. The transposon insertion exactly matches the size and sequence of IS1, a transposon in E. coli. It is likely that IS1 infected the 10-kb sub-assembly following its transfer to E. coli. The IS1 insert is flanked by direct repeats of M. mycoides sequence suggesting that it was inserted by a transposition mechanism. The 85-bp duplication is a result of a non-homologous end joining event, which was not detected in our sequence analysis at the 10-kb stage. These two insertions disrupt two genes that are evidently non-essential.

So we aren’t quite at the stage of building novel multicellular plants or animals; that’s going to be a long way down the road. But it does mean we can expect to be able to build custom bacteria within another generation, I would think, and that they will provide some major new industrial potential.

I know that there are some ethical concerns (Venter also mentions them in the paper), but I’m not personally too worried about them just yet. The cell that was created is not a monster with ten times the strength of an ordinary cell and the brain of a madman; it’s actually more fragile and contains only genes found in naturally occurring species (and a few harmless markers). When the techniques become economically practical, everyone will be building specialized bacteria to carry out very specific biochemical reactions, and again, they’re going to be poor generalists and aren’t going to be able to compete in survival with natural species that have been honed by a few billion years of selection for fecundity and survivability.

Give it a decade or two, though, and we’ll have all kinds of new capabilities in our hands. The ethical concerns now are a little premature, because we have no idea what our children and grandchildren will be able to do with this power. I don’t think Wöhler could have predicted plastics from his discovery, after all: we’re going to have to sit back, enjoy the ride, and watch carefully for new promises and perils as they emerge.


Gibson et al. (2010) Creation of a Bacterial Cell Controlled by a Chemically Synthesized Genome. Science Express.

Lartigue et al. (2009) Creating Bacterial Strains from Genomes That Have Been Cloned and Engineered in Yeast. Science 325:1693-1696.

Venter has done it

We’re hearing the first stirrings of a big story: Craig Venter may have created the first organism with an artificially synthesized genome. Conceptually, building a strand of DNA and inserting it into a cell stripped of its genome is completely unsurprising — of course it will work, a cell is just chemistry — but it is a huge technical accomplishment.

Carl Zimmer has more background. I want to see the paper.

Another volley in the battle

This essay on the accommodationists vs. the ‘new atheists’ gets off to a bad start, I’m afraid, and I had some concern it was going to be another of those fuzzy articles.

There is a new war between science and religion, rising from the ashes of the old one, which ended with the defeat of the anti-evolution forces in the 2005 “intelligent design” trial.

That’s incorrect. The anti-evolutionists have not been defeated — they got smacked in the nose with a rolled-up newspaper, and that’s about it. The creationists are still thriving, and in some places (like Texas) getting even bolder and noisier.

It gets better from there, though. It’s a polite framing of the arguments between the apologists for religion and the opponents of religion, and the author favors the latter.