Modeling metazoan cell lineages


A while back, I criticized this poorly implemented idea from Paul Nelson of the Discovery Institute, a thing that he claimed was a measure of organismal complexity called Ontogenetic Depth. I was not impressed. The short summary of my complaints:

  • Unworkable idea: There was no explanation about how we could implement and test the idea, and despite promises at the time, Nelson still hasn’t produced his methods.
  • False assertions and confusing examples: He claims that all changes in early lineages are destructive, for instance, which is false.
  • Bad metaphors: He uses a terribly flawed metaphor of a marching band to explain how development works; I’d say that it’s a better example of how development doesn’t occur.
  • No research: For something billed as a research program, the complete absence of any research is a major shortcoming.

Recently, Nature published a paper by Azevedo et al. that superficially might resemble Nelson’s proposal, in that it attempts to quantify the complexity of developing organisms by looking at the pattern within their early lineages. The differences are instructive, though: this paper clearly explains their methodology, presents many of the limitations, and draws mostly reasonable conclusions from the work. It is an interesting paper and contains some good ideas, but has a few flaws of its own, I think. My main objections are that its limitations are even greater than the authors mention, and there are some conclusions that are driven by an adaptationist bias.

First, I’m going to give a bit of background. Some organisms, such as nematodes and ascidians, have a remarkably precise pattern of cell divisions in early development. One can map out the lineage of each cell, division by division, and see the same pattern and the same fate in every individual—the pattern is so stereotyped that each cell can be individually named, as in this pedigree of all of the cells in the first four divisions of the nematode embryo.


There are quite a few invertebrate species that have this kind of determinate cell lineage, but it isn’t the only way to do things. We vertebrates, for instance, tend to have a series of early cell divisions that do not sort out into specific fates, but instead produce a large pool of essentially identical cells. They are then shunted into various tissues by the vagaries of chance, location, and global signals. This is one limitation of the work of Azevedo et al., that it doesn’t seem easily applicable to organisms with less rigid patterns of development. They are using the pattern of cell divisions as a measure of developmental rules, but in us, cell divisions aren’t so clearly linked to developmental decisions.


In nematodes, though, the pattern is sharply defined. Here is a lineage map of the fate of the progeny of one particular cell, named T.ap, which gives rise to a specific population of epidermal cells, neurons, and support cells in the wild type animal. The precise details don’t matter here, but what you can see in the diagram at left is the stereotyped sequence of divisions. T.ap divides once to generate one daughter cell that will eventually make the population of green boxes at the bottom; the other daughter cell also divides in its own pattern that eventually leads to one particular cell that undergoes programmed cell death, the “X”.

For now, all you need to know is that bit: one daughter always goes on to produce green boxes, the other goes on to make one great-great-grandchild that kicks the bucket.


The pattern of divisions is hardwired into these cells. Mutations have been identified that change the cleavages in interestingly stereotyped ways. To the left, for example, is the pedigree of T.ap cells in a mutant called lin14(gf) (lin14 is the gene, the (gf) means it is a gain-of-function mutation: the animal has a particularly potent form of lin14). Look at what happens: T.ap divides into two daughter cells, and the one that usually goes on to make the dead great-great grandchild carries out its program normally. The other daughter cell, though, divides and produces one daughter that is supposed to go on to make the green boxes, but instead, it seems to think it is T.ap—it divides to generate a lineage that is a carbon-copy of the T.ap progeny, right down to producing that dead great-great-grandchild.


It’s as if this one specific sequence of cell divisions and cell fates is an independent module, a programmed series, that is regulated by a relatively simple switch, the lin14 gene. It produces a specific and reiterated set of cell divisions. Produce lots of lin14 in a cell, and it will then go on to automatically produce a subset of progeny like the pedigree to the right.

This is powerful stuff. At least in those animals with strictly defined embryonic cell lineages, reiterated patterns of cell division represent a modular program of development that can be switched on and off in evolution. The lin14(gf) mutant is a dead end, of course, since the animal gets stuck in a rut and never develops past that first larval stage, but we can see other examples of reiterated lineages in nematodes and other animals that are functional… for instance, look at the A5.1 and A5.2 cells in this ascidian lineage.


One way to look at the development of these animals is that they form nested, reiterated sublineages; an animal with many cells and cell types is built by following a relatively small number of rules repeatedly. Azevedo et al. have looked at those rules, and see the repetitions as a mathematical pattern that can be reduced to a shorter algorithmic description. They start with the known pattern of cell divisions and encode it as a series of simple rules, such as “cell X → {neuron, epidermis}”, if a cell divides to form a neuron and an epidermal cell. After encoding this verbose, literal description of the pattern, they then compress it by collapsing equivalent rules until they have a program that produces the same set of cells with a minimal algorithm. Lineages with many reiterated sublineages will compress more readily and yield a smaller number of necessary instructions per final cell, while cases where every division is unique in its outcome will be incompressible, requiring a separate rule for every division. The ratio of the minimal number of reduced rules to the total number of cell divisions is therefore a measure of the complexity of the lineage.
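The reduction described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: it assumes that two divisions count as equivalent when they yield the same unordered pair of daughters (terminal cell types or already-reduced rules), and the `toy` lineage and the name `lineage_complexity` are made up for the example.

```python
from typing import Union

# A lineage is either a terminal cell type (a string) or a division,
# written as a pair (left_sublineage, right_sublineage).
Lineage = Union[str, tuple]


def lineage_complexity(root: Lineage) -> tuple[int, int, float]:
    """Return (reduced_rules, divisions, reduced_rules / divisions)."""
    reduced = {}    # unordered daughter pair -> id of the reduced rule
    divisions = 0   # one literal rule per cell division

    def visit(node):
        nonlocal divisions
        if isinstance(node, str):
            return ("T", node)              # terminal cell type
        divisions += 1
        a, b = visit(node[0]), visit(node[1])
        key = tuple(sorted((a, b)))         # ignore left/right order
        if key not in reduced:
            reduced[key] = len(reduced)     # a genuinely new rule
        return ("R", reduced[key])

    visit(root)
    if divisions == 0:
        raise ValueError("a lineage needs at least one division")
    return len(reduced), divisions, len(reduced) / divisions


# Hypothetical toy lineage: the {Neu, X} division occurs three times.
toy = ((("Neu", "X"), "Epi"), (("Neu", "X"), ("X", "Neu")))
rules, divisions, c = lineage_complexity(toy)
print(f"{rules} reduced rules / {divisions} divisions = {c:.0%}")
```

In the toy lineage, the three {Neu, X} divisions collapse into a single reduced rule, so six divisions need only four rules. A lineage in which every division is different would come out at 100%.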

The diagram below illustrates a described lineage from the nematode on the left. You can see, for instance, that there are 3 pink cells that do the same thing; they divide to produce a cell of type “neu” and a cell of type “X” by Rules R7, R15, and R16. Since Rules R7, R15, and R16 do the same thing, however, they can be compressed to Reduced Rule RR7 in the diagram to the right. Figure b is a shorter algorithmic description of the pattern in part a, and says that we only need 11 rules, RR0 through RR10, to build the actual distribution.

Example of the calculation of cell lineage complexity. a, The C. elegans ABarapp sublineage gives rise to 18 terminal cells of four different types (open circles): epidermal (Epi), neuron (Neu), structural (Str), and death (X). We begin by describing the cell lineage as a series of 17 rules, one for each cell division (solid circles): R0→{R1,R2}, R1→{R3,R4}, …, R16→{Neu,X}. Solid circles of the same colour indicate equivalent rules, ignoring planes of cell division (for example, R7, R15 and R16). b, The minimum algorithmic description of the ABarapp sublineage consists of 11 reduced rules. Each reduced rule is represented by a solid circle labelled RR0–RR10, with a unique colour matching that of equivalent cell divisions (for example, RR7→{Neu,X} corresponds to the initial rules R7, R15 and R16). The lineage complexity of ABarapp is calculated as the number of reduced rules divided by the total number of cell divisions: C = 11/17 = 65%.

In the animal, there are 17 cell divisions. The number of reduced rules is 11, so the relative complexity of this system is 11/17, or 65%. If the system were incompressible and each division were unlike all the others, it would require 17 rules to describe 17 divisions, so the complexity would be 100%. All clear? The lower the complexity number, the more repetitious the sequence of cell divisions is. In the example of lin14 above, the lin14(gf) mutant would cause an extreme reduction in the complexity of the lineage.

The authors applied their method to three nematode lineages (in C. elegans, Pellioditis marina, and Halicephalobus gingivalis) and one ascidian (Halocynthia roretzi) and got complexity values of 35%, 38%, 33%, and 32%, or roughly a third of the complexity of equivalent systems with no reiterated lineages. They also compared the complexity values to random networks. That is, if you generate a lineage on the computer with random bifurcations at each division, you also expect a complexity value less than 100% because sometimes, just by chance, two divisions will produce the same outcome. The real networks were still simpler than the random networks by 26-45%.
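To get a feel for why even random lineages score below 100%, here is a rough Python sketch. It is not the authors' randomization procedure: it assumes a random lineage can be built by repeatedly joining two randomly chosen cells or sublineages until one root remains, which keeps the same terminal cells and the same number of divisions. The function names and the 16-cell example are hypothetical.

```python
import random


def complexity(root):
    """Reduced rules / divisions, treating daughter order as irrelevant."""
    reduced = {}
    divisions = 0

    def visit(node):
        nonlocal divisions
        if isinstance(node, str):
            return ("T", node)                       # terminal cell type
        divisions += 1
        key = tuple(sorted((visit(node[0]), visit(node[1]))))
        return ("R", reduced.setdefault(key, len(reduced)))

    visit(root)
    return len(reduced) / divisions


def random_lineage(terminals, rng):
    """Join two randomly chosen nodes until a single root remains."""
    pool = list(terminals)
    while len(pool) > 1:
        a = pool.pop(rng.randrange(len(pool)))
        b = pool.pop(rng.randrange(len(pool)))
        pool.append((a, b))                          # one division
    return pool[0]


def regular_lineage(terminals):
    """Pair up neighbours level by level (length must be a power of two)."""
    level = list(terminals)
    while len(level) > 1:
        level = [(level[k], level[k + 1]) for k in range(0, len(level), 2)]
    return level[0]


# Hypothetical embryo: 16 terminal cells, 8 neurons and 8 cell deaths.
terminals = ["Neu", "X"] * 8
rng = random.Random(0)
regular = complexity(regular_lineage(terminals))
randomized = [complexity(random_lineage(terminals, rng)) for _ in range(200)]
print(f"regular lineage:      {regular:.0%}")
print(f"random lineages, avg: {sum(randomized) / len(randomized):.0%}")
```

The perfectly reiterated lineage needs only 4 rules for its 15 divisions. The random lineages can never do better than that, but they also rarely reach 100%, because with only two terminal cell types some divisions coincide by chance.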

This diagram illustrates this idea. The top figure is the actual reduced description for the lineage of Halocynthia roretzi, with a complexity of 32%. The second figure is a random network, generated by a computer with the only constraint being that it produce the same distribution of cell types in the same number of cell divisions; it’s obviously much more complicated.

The simplicity of the ascidian cell lineage. Shortest algorithmic descriptions of three lineages capable of generating the cells in the H. roretzi tissue-restricted stage embryo. a, The real lineage has a complexity of C = 32%. b, A random bifurcation lineage with over twice the complexity of the real one (C = 76%; Fig. 2d). c, The simplest lineage evolved from the H. roretzi lineage by selection for low complexity is approximately half as complex as the real one (C = 17%; Fig. 4d). Solid circles represent the reduced rules required to generate the different terminal cell states (open circles): endoderm (End), epidermis (Epi), mesenchyme (Mes), muscle (Mus), nervous system (Ner), notochord (Not) and undifferentiated (Und).

The third figure, c, is an example of the outcome of a simulation. The simulation is constrained to again produce the same distribution of cells as a final result, but is free to modify the rules within the lineage until the simplest possible rule set is identified. The computer was able to find an alternate set of rules that was half as complex as the observed set. The message is that real pedigrees are much simpler than either the worst possible or a random rule set, but still somewhat more complex than an optimum.

It’s an interesting paper and has the virtue of applying quantitative techniques to the problem of complexity in evolution, but I’m not entirely satisfied. I think there are still some problems here, and I don’t entirely trust the numbers that the authors have generated.

One concern is that the descriptions of the terminal fates of these cells are only approximations. When one cell is described as “neuron” and another is also “neuron”, the fact that they are labeled as identical in the rulesets may be an artifact of incomplete knowledge. Maybe the first cell is “serotonergic neuron”, while the second is “GABAergic neuron”, and future detailed analysis will make the complexity values go up. The values seem to be a lower bound, at best.

The authors do mention that interactions between cells are not incorporated into their models, and that position is only approximated as a place on the rule tree. This is a serious shortcoming; orientation, position, and interactivity with other cells and the environment are vital parts of the developmental story, and there is a genetic bias to their analysis. For instance, look at this lovely structure:


That’s the nematode vulva, and it is assembled from the products of a determinate cell lineage, but it is a signal from one cell (ac, the anchor cell) that localizes the vulva, and it is the orientation and location of the component cells that defines the vulval opening and the associated tissues. These spatial and morphological factors are neglected in the modeling. (By the way, the vulva development story is a wonderful piece of work…I’m going to have to put that on my list of things to write up.)

My strongest objection to the paper, though, is that it is cast as a purely adaptive story. The simplicity of the rule networks is presented as the product of selection for minimal rules in the history of these animals. I’d actually argue the other way, that what has happened is that evolution occurred by amplification of simpler, core modules in development; the building blocks were simple. Their techniques are actually showing the underlying modularity of development, not a process of paring away complexity.

Azevedo RBR, Lohaus R, Braun V, Gumbel M, Umamaheshwar M, Agapow P-M, Houthoofd W, Platzer U, Borgonie G, Meinzer H-P, Leroi AM (2005) The simplicity of metazoan cell lineages. Nature 433:152-156.


  1. Second Dan says

    Dude! Ya can’t go putting up-close photos of nematode vulvas right there on my screen – I’m reading this at work!

    Good post. Just because you can measure something doesn’t make it relevant – and that’s even if you can measure it, which you probably can’t. “More complex” could just as easily mean “less efficient” or “poorly derived”, or even a combination of those explanations at an unknown ratio. What’s the value actually representing, after all? Variation in representational systems? Proportion of analogous lineage?

    It’s great to say “here is a thing”. It’s another matter entirely to show us how that thing matters.

  2. says

    I think an interesting thing to demonstrate would be the variety of viable organisms that one can produce given some core development modules. The very simple and very complex amplifications should mostly give rise to unviable organisms.

  3. PaulC says

    My strongest objection to the paper, though, is that it is cast as a purely adaptive story. The simplicity of the rule networks is presented as the product of selection for minimal rules in the history of these animals. I’d actually argue the other way, that what has happened is that evolution occurred by amplification of simpler, core modules in development; the building blocks were simple.

    I’m not a biologist, but that strikes me as a reasonable bet. I don’t see why the ancestor should start out random and simplify adaptively. The ancestor could just as easily start out very highly regular and adaptively become more complex. The fact that there are different cells that all produced Neu and X may suggest that in ancestral organisms there was a larger set of related cells that produced Neu and X but variation eventually left just the “islands” of seemingly unrelated cells.

    My intuition derives from cellular automata, which may not be a perfect metaphor but is probably better than marching bands. The T.ap example struck me as exactly like a puffer, such as one of those found in Conway’s Game of Life. Puffers are more common in CA than people might first expect, and they behave as follows: A simple starting pattern is run for a number of steps after which it produces a displaced copy of itself along with some debris. The copy is sufficiently far away to carry out the same generations without being disturbed by the debris. Note that there are actually many failed puffers in which the debris catches up. But the point is that it is very common to find strongly periodic systems.

    The reason that puffers exist is not that they’re designed into the rules but because simple patterns tend to grow into larger ones and if they are simple enough, there is a reasonable statistical expectation that simple patterns exist that produce a copy of themselves among the debris. The probability is even higher than getting any particular pattern as a result, since it works like the birthday paradox: the high likelihood that with 30 or so people, two share a birthday. It’s unlikely to find a pattern that generates any particular result, but common to find ones in which the starting pattern matches its result in some way.

    Anyway, I can imagine that in self-reproducing living systems that periodicity of the T.ap variety is quite common for the same reason that puffers are common. But simple periodicity may not have the adaptive advantages of less regular differentiation. So over time, one would expect less regular development trees to evolve from more regular ones rather than the other way around. There is no reason to assume that randomness is the default behavior.

  4. says

    it would seem wise to link to the original post (this is from the archives), especially since one of the authors responded to some of PZs points in the comments.

  5. says

    It is something of an axiom – or at least, a well-accepted engineering principle – that the only successful large software systems are those that grew out of successful small software systems. This is to a large extent a statement about the human ability to manage complexity, but I think it also says something about systems in general – those systems that prosper expose the little design mistakes early, when it’s still easy to correct them.

    It occurs to me that here we may be seeing a similar principle in action – the only successful (i.e. living) large, complex organisms are those that evolved from successful smaller organisms. I guess in a sense we already knew this, but it’s nice to see it confirmed from a different angle.

  6. Francis says

    as a failed computer scientist, i was quite interested. the concept of recursion is one of the most powerful tools in programming and is very hard to do well, but very easy to do simply.

    it’s fascinating that these early creatures locked into the concept as a useful tool.
