The Irish Genome


What a curious paper — it’s fine research, and it’s a useful dollop of data, but it’s simultaneously so 21st century and on the edge of being completely trivial. It’s like a tiny shard of the future whipping by on its way to quaintness.

Researchers have for the first time sequenced the genome of an Irishman, a fellow confirmed to be the product of at least 3 generations of fully Irish ancestors.

It’s a good piece of work, another piece in the puzzle of human genomics, but it’s also a little bit odd. I’m always excited to see another organism’s genome sequenced (the first marsupial, the first sea anemone, the first avian, etc.), even though it’s also become a bit commonplace (oh, another bacterium sequenced…); it’s just weird to see “Irish” announced as a novel addition to the ranks of sequenced organisms, as if it were Capitella or something. Cool, but a little jarring.

It’s also a genre with limited prospects. If you’re busy sequencing the first Armenian or the first New Guinean or the first Luxembourger, work fast — I can’t quite imagine that most will warrant a publication, except as a formality, as I imagine this paper is. We’re entering the era of personalized genomics, when anyone will be able to get their sequence done for under a thousand dollars. I don’t imagine that a paper titled “Sequencing and analysis of PZ Myers’ human genome” will get published in Nature. But if anyone wants to try, I’ll gladly send them a few cells and my permission.

Anyway, the paper got the sequence of this Irish fella. They identified many unique single nucleotide polymorphisms that may be useful molecular markers of Irish ancestry; a few of the new alleles seem to be associated with diseases like inflammatory bowel disease and chronic liver disease. They also identified a few genes bearing the signature of positive selection. Here are their conclusions:

The first Irish human genome sequence provides insight into the population structure of this branch of the European lineage which has a distinct ancestry from other published genomes. At 11 fold genome coverage approximately 99.3% of the reference genome was covered and more than 3 million SNPs were detected, of which 13% were novel and may include specific markers of Irish ancestry. We provide a novel technique for SNP calling in human genome sequence using haplotype data and validate the imputation of Irish haplotypes using data from the current Human Genome Diversity Panel (HGDP-CEPH). Our analysis has implications for future re-sequencing studies and suggests that relatively low levels of genome coverage, such as that being used by the 1000 genomes project, should provide relatively accurate genotyping data. Using novel variants identified within the study, which are in linkage disequilibrium with already known disease associated SNPs, we illustrate how these novel variants may point towards potential causative risk factors for important diseases. Comparisons with other sequenced human genomes allowed us to address positive selection in the human lineage and to examine the relative contributions of gene function and gene duplication events. Our findings point towards the possible primacy of recent duplication events over gene function as indicative of a gene's likelihood of being under positive selection. Overall we demonstrate the utility of generating targeted whole genome sequence data in helping to address general questions of human biology as well as providing data to answer more lineage-restricted questions.

Hey, it’s data. But I think it will be made much more interesting when it acquires more context. One Irish genome doesn’t give us much information on Irish variation. It’s information to complement the 1000 Genomes Project (the Irish study is not part of that bigger project), which intends to take a nice snapshot of human genetic diversity by sampling 100 individuals from each of 10 distinct populations. Then the hard part comes: comparing and analyzing everything.

Oh, and the digging out from all the ethnic jokes that will appear in the comments.


Tong P, Prendergast JGD, Lohan AL, Farrington SM, Cronin S, Friel N, Bradley DG, Hardiman O, Evans A, Wilson JF and Loftus BJ (2010) Sequencing and analysis of an Irish human genome. Genome Biology (in press)