Evolution after Gene Duplication: a review

I’ve just finished reading Evolution after Gene Duplication, which was edited by Katharina Dittmar and David Liberles and released in 2010.  Before I bought the book I struggled in vain to find a review, so I thought I would attempt to provide one here.  However, before opining on the book itself, it’s worth a brief look at the value of writing and reading such a book.

In 1970, Susumu Ohno famously dubbed gene duplication “the major force of evolution”.  By his own admission, this statement was made “on the basis of the scant evidence available”; but the technological advances of the subsequent 43 years have provided the means to test some of his revolutionary hypotheses, and extend our knowledge of evolution by gene duplication.  In their book, Dittmar and Liberles catalogue work from a wide range of fields, and at each stage highlight the improvements these studies have made to our understanding of the evolution of gene duplicates.

Evolution after Gene Duplication (Dittmar & Liberles, 2010) is a valuable exploration of the evolution of gene duplicates
(Image reproduced with permission from Wiley-Blackwell)

The book starts with chapters that describe in a broad sense the evolution and divergence of gene duplicates, and the factors governing their retention.  It then moves on to the specifics of the mechanistic basis of duplication, before a series of chapters explore how duplication can be studied using mathematical models, phylogenetics and systems biology.  The final three chapters each consider a different case study: birds, plants, and vertebrates.

This may sound like a lot of topics to cover in one book, but far from skimming over their surface, each is treated in enough depth to warrant inclusion.  Indeed, I found one of the best aspects of the book to be that duplication is considered in its full context.  This served to broaden my outlook on gene duplication, and also provided an accessible introduction to the methods used in these studies.

Also valuable are the chapters that explore aspects of gene duplication that are often little-considered or taken for granted, such as redundancy (is it ever selectively advantageous?) and the cost of duplication (what is the size and nature of this cost?).  Both of these chapters (by Ran Kafri & Tzachi Pilpel and by Andreas Wagner, respectively) do an excellent job of laying out the evidence for the competing hypotheses, as well as extending the ideas with their own thoughts.

An outstanding chapter was “Myths and Realities of Gene Duplication”, by Austin Hughes and Robert Friedman.  Here the authors pull no punches:

“The proliferation of numerous ill-founded statistical methods has given rise to a kind of ‘computer-assisted storytelling’ that purports to test hypotheses but in fact does not adequately consider alternatives.”

They challenge some ideas that were first proposed by Ohno forty years ago, and have become almost canonical since, such as “The Polyploidization Obsession” (their term).   Their dissection of the logical basis and evidence for these ideas makes for refreshing and stimulating reading.

If I had one small gripe, it would be each chapter’s retreading of basic concepts, such as the established evolutionary paths for duplicates (retention, subfunctionalization, neofunctionalization & pseudogenization); however, this is unavoidable in any book that assimilates contributions from so many authors in so many fields.  The repeated treatment of these concepts does at least serve as a barometer for each author’s opinion on such integral ideas.

I would highly recommend Evolution after Gene Duplication.  Its breadth of topic and depth of detailed thought make it a valuable book.



Dittmar, K. & Liberles, D. (2010) Evolution after Gene Duplication.  Wiley-Blackwell.

Ohno, S. (1970) Evolution by Gene Duplication.  Springer-Verlag.

Starch gave canine evolution something to chew on

A dog’s dinner can be scraps from the table, a juicy bone or an incredibly unlucky piece of homework.  Dogs have been omnivorous since their domestication around 15,000 years ago, just before humans developed agriculture.  Indeed, it appears that evolving the ability to break down starch (a large, complex molecule) into glucose (much simpler and easier to digest) was a key step in the speciation of dogs from wolves.  So what is the genetic basis of this varied appetite?  And was the evolution of broader gastronomic horizons the cause of domestication, or a consequence of it?

Dogs diverged from wolves roughly 15,000 years ago
(Creative Commons – Author: Dennis Matheson)

These are the questions that Axelsson and colleagues set out to answer.  They sequenced the genomes of 60 dogs from 14 different breeds, as well as 12 wolves to use as a measure of the ancestral state of dogs before their domestication.  Once the sequencing was complete, they pooled all of the dog genomes to make one genome representative of canine diversity (this was done to counter the effects that selective breeding may have had on particular dog breeds).  They then looked for regions in the dog genome that had low diversity (indicative of the action of selection at these sites), and high divergence from wolves (suggestive of a role in dog speciation).
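As a rough illustration of this kind of scan (and emphatically not the authors' actual pipeline), the sketch below flags genomic windows whose diversity is unusually low and whose divergence from wolves is unusually high.  The per-window values, the z-score cut-off and the function names are all hypothetical, chosen only to show the logic.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardise a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def candidate_windows(diversity, divergence, z_cut=1.0):
    """Return indices of windows with low diversity AND high divergence.

    diversity  -- per-window diversity within the pooled dogs (toy values)
    divergence -- per-window dog-versus-wolf divergence (toy values)
    z_cut      -- illustrative threshold, not taken from the paper
    """
    z_diversity = z_scores(diversity)
    z_divergence = z_scores(divergence)
    return [
        i
        for i, (low, high) in enumerate(zip(z_diversity, z_divergence))
        if low < -z_cut and high > z_cut
    ]


# Toy data: only window 3 combines low diversity with high divergence.
diversity = [0.30, 0.28, 0.31, 0.05, 0.29, 0.32]
divergence = [0.10, 0.12, 0.09, 0.60, 0.11, 0.10]
hits = candidate_windows(diversity, divergence)
```

Requiring both signals at once is what narrows the search: low diversity alone could reflect a recent bottleneck, and high divergence alone says nothing about selection within dogs.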

They found 36 regions that had both low diversity and high divergence.  In these 36 regions were a total of 122 genes, 10 of which were involved in digestion.  This gave the authors 10 candidate genes, some or all of which could have driven the adaptation to starch digestion in early dogs.  They found evidence for selection on three: AMY2B, MGAM and SGLT1.

AMY2B is an amylase that breaks down starch into maltose.  Intriguingly, the number of sequencing reads mapping to this gene in dogs was far higher than in wolves, suggestive of a duplication of this gene in dogs (two copies would be expected to double the number of reads, four copies would quadruple it, and so on).  Investigating this further using qPCR (which quantifies the level of expression of a gene in RNA, or the number of copies of a gene in genomic DNA), the authors found between 2 and 15 copies of AMY2B in different dog breeds.  And when they measured AMY2B expression and the ratio of starch to maltose in the blood (a proxy for AMY2B activity), both were 5-20 times higher in dogs than wolves.  AMY2B duplicates have clearly been retained by selection, and this increase in AMY2B copy number is correlated with increased starch digestion.  However, what is not clear is how many of these different copies are contributing to starch digestion in each breed.  There could be slightly elevated expression of each copy, or one or two copies could have massively increased expression; both situations would lead to the increase in overall amylase levels seen by the authors.
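The read-depth logic above can be sketched in a few lines.  This is only a toy illustration with made-up read counts, not the authors' analysis: it normalises the reads on the gene by the reads on a control region assumed to be single-copy (so that differences in sequencing depth cancel out), then compares dogs with wolves.  The function name and all of the numbers are hypothetical.

```python
def relative_copy_number(gene_reads, control_reads):
    """Reads on the gene per read on a single-copy control region."""
    return gene_reads / control_reads


# Hypothetical counts: the dog sample was sequenced twice as deeply as the
# wolf sample, which is why the control region also has twice the reads.
dog_depth = relative_copy_number(gene_reads=1600, control_reads=200)
wolf_depth = relative_copy_number(gene_reads=100, control_reads=100)

# Fold difference in copy number between dog and wolf.
fold_change = dog_depth / wolf_depth
```

With these invented numbers the dog carries eight times as many copies as the wolf, even though the raw read count on the gene is sixteen times higher.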

Starch was an abundant food source, making the ability to digest it selectively advantageous in early dogs (Creative Commons – Author: Antony Stanley)

Intriguingly, duplication has also been documented in human amylase.  At some point after humans diverged from mice, amylase (which is usually expressed only in the pancreas) duplicated to produce two copies.  One of these copies remained pancreas-specific, while the other specialized to be expressed only in the saliva (Meisler & Ting, 1993).  It would be fascinating to know whether a similar situation of duplication and specialization has occurred in these dog amylases.

The next gene showing evidence of selection was maltase-glucoamylase (MGAM), which breaks down maltose into glucose.  The authors sequenced MGAM in an additional 71 dogs from 38 breeds, and found that 68 of these dogs carried the same allele (gene form).  They also found that expression of MGAM was 12 times higher in dogs than wolves.  That this allele has been preserved in such a large proportion of dog breeds is powerful evidence for its importance in starch digestion.

The final gene implicated in the evolution of starch digestion was SGLT1, which helps transport glucose across the gut wall into the blood.  Again, the authors looked at the sequence of SGLT1 in 71 dogs, and found a high level of divergence from wolves.  However, the evidence for selection on this gene is weaker than for the other two, for a couple of reasons: firstly, their sequencing covered only one end of the gene, not its entire length; and secondly, they found no difference in expression of SGLT1 between dogs and wolves.

There is one caveat to the expression studies carried out for all three genes, about which the authors are admirably candid:

“we cannot rule out that diet-induced plasticity contributed to this difference”

In other words, it may have been that they happened to compare some particularly well-fed dogs with some very hungry wolves, and that this caused much of the difference in their levels of gene expression and in the blood measurements used as a proxy for enzyme activity.  But in my opinion, given the strong evidence for selection at AMY2B and MGAM, and the difficulty in standardizing diets across all individuals measured (would you want to tell a wolf what to eat?), this weakness is not fatal to their argument.

Changes in AMY2B and MGAM were selected for during dog domestication (Creative Commons – Author: Steve Guttman)

So it appears that selection has acted on multiple stages in the starch digestion pathway.  Fascinatingly, this selection has taken different forms: at the starch-maltose stage (AMY2B) gene duplicates have been selectively retained; whereas at the maltose-glucose stage (MGAM) there have been changes in the sequence and expression of a single gene.

However, one mystery remains: did this selection on starch digestion happen before dogs were domesticated (wolves scavenging at waste dumps), or after their initial domestication (guard/hunting dogs fed starchy waste)?  Perhaps a better question is: does it matter which came first?  A far more interesting issue is how dogs adapted to this new food source, and Axelsson et al have provided a convincing exploration of the genetic basis of this adaptation.


Axelsson, E. et al (2013) The genomic signature of dog domestication reveals adaptation to a starch-rich diet.  Nature advance online publication

Meisler, M.H. and Ting, C-N (1993) The remarkable evolutionary history of the human amylase genes.  Critical Reviews in Oral Biology & Medicine 4: 503-509

The key to great beer? Gene duplication

Brewing beer is a tricky business.  A lot of the ingredients, like the yeast and the hops, don’t go into the finished product and need to be filtered out.  In the case of yeast, this is made much easier by the process of flocculation: once all the sugars in the brew have been consumed during fermentation, the yeast falls out of solution and collects at the bottom of the tank.  The amount of yeast left in the brew affects both the taste and the cost of production: flocculation has to happen at the correct rate to produce a good beer at a good price.  Because of this, artificial selection pressure on flocculation has been very heavy during the domestication of yeast (Saccharomyces cerevisiae) for brewing (Jin and Speers, 1998).

Flocculation is controlled by a family of genes called the flocculins (Jin and Speers, 1998).  During artificial selection, individual flocculin genes could have evolved quickly, as has been documented for different genes in numerous other domestication episodes (for example, the teosinte branched1 gene in maize (Wang et al, 1999)).  Additionally, the family as a whole could have been selected to increase in size through duplication, with each extra duplicate in the family providing more raw material on which selection could act.  Documented examples of this second route are extremely rare.

To search for increases in gene family size during yeast evolution, Hahn et al (2005) measured the sizes of all gene families shared by five species of yeast.  They then measured the change in size of each gene family along the phylogeny of these five species, and compared this to the change expected by chance, to identify those families that got significantly bigger or smaller during yeast evolution.
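To give a feel for what “the change expected by chance” might mean, here is a minimal simulation sketch.  It is not the method of Hahn et al; it simply assumes that every gene copy has the same small chance of being duplicated or lost in each time step, simulates many families along a single branch, and asks how often a family ends up at least as large as the one observed.  The starting size, rate, number of steps and function names are all hypothetical.

```python
import random


def simulate_family_size(start_size, rate, steps, rng):
    """Simulate one gene family under equal per-copy gain and loss rates."""
    size = start_size
    for _ in range(steps):
        # Each existing copy independently gets a chance to duplicate...
        gains = sum(1 for _ in range(size) if rng.random() < rate)
        # ...and a chance to be lost (losses can never exceed the size).
        losses = sum(1 for _ in range(size) if rng.random() < rate)
        size = size + gains - losses
    return size


def empirical_p_value(simulated_sizes, observed_size):
    """Fraction of simulated families at least as large as the observed one."""
    hits = sum(1 for s in simulated_sizes if s >= observed_size)
    return hits / len(simulated_sizes)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
simulated = [simulate_family_size(8, 0.02, 50, rng) for _ in range(2000)]

# How surprising is a small expansion compared with a large one?
p_modest = empirical_p_value(simulated, 9)
p_large = empirical_p_value(simulated, 14)
```

A family whose observed size sits far out in the tail of the simulated distribution (a small p-value) is a candidate for having grown or shrunk faster than chance alone would predict.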

They found that the flocculin gene family increased dramatically in size along the branches leading to S. cerevisiae (brewer’s yeast).  While there are 6-11 flocculins in the other species of yeast, S. cerevisiae has 14 flocculins.  This increase in gene family size may be adaptive: selection on flocculation could have caused retention of flocculin duplicates, each of which can specialize in a different aspect of flocculation.

The next step would be to measure the evolutionary rate and function of each member of the flocculin family in S. cerevisiae, to confirm that they have been retained by selection.  However, this study is still a remarkable demonstration of the dramatic changes in gene copy number that can occur in response to strong selection, be it natural or artificial.  As the authors say:

“This is the first example to our knowledge, however, of selection on gene family size being implicated in domestication.”

So the next time you raise a glass, remember the key ingredient!



Jin, Y-L. and Speers, A. (1998) Flocculation of Saccharomyces cerevisiae.  Food Research International 31:421-440.

Hahn, M.W., de Bie, T., Stajich, J., Nguyen, C. and Cristianini, N. (2005) Estimating the tempo and mode of gene family evolution from comparative genomic data.  Genome Research 15:1153-1160

Wang, R-L, Stec, A., Hey, J., Lukens, L. and Doebley, J. (1999) The limits of selection during maize domestication.  Nature 398:236-239

Keratin: another exemplarily evolving gene

Keratin is most familiar as the structural protein of hair.  It seems innocuous enough, and certainly not something that would be vital to the evolution and diversification of a group as major as the tetrapods (four-legged land animals).  However, a recent paper in Molecular Biology and Evolution reveals just that: not only did the keratin genes diversify and radiate as the different tetrapod taxa evolved and diverged; they were also the basis of the acquisition of traits integral to a life on land.

Keratins are split into two groups: beta keratins, which are found only in the sauropsids (reptiles, birds and their fossil ancestors); and alpha keratins, which are found in all tetrapods.  These alpha keratins are themselves split into two groups, types I and II: both groups are important in the structure of the cytoplasmic cell network that gives appendages like hairs and beaks their toughness.  The authors took known keratin gene sequences, and used these to search for keratin genes in the sequence data from the genomes of representatives of all tetrapods (frogs, birds, echidnas, marsupial mammals and placental mammals).  Interestingly, all genes were located in one of two gene clusters.  To me, this points to tandem duplication (when one gene is doubled by mistake during cell replication, with the new copy placed next to the original) being responsible for their proliferation.

They then aligned these sequences by similarity and produced a phylogenetic tree to show their evolutionary relationships.  To minimise the number of duplications inferred by this process they used the Minimal Early Duplication (MED) model, hence avoiding accusations of their duplication numbers being an artefact of the alignment and treeing process.

They produced four significant results.  Firstly, they found that all type I keratins are monophyletic (all members of the group came from one ancestral gene, and the group encompasses all descendants of that ancestor), as are all type II.  This exemplifies beautifully the process of molecular evolution: mistakes during the replication of one ancestral gene produce copies that are subtly different from the original, and that can therefore change the phenotype they produce in a way beneficial to the organism.
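The definition of monophyly given above is easy to turn into a check on a tree.  The sketch below is purely illustrative: it assumes a rooted tree written as nested tuples with gene names at the tips, and the toy tree and gene names are invented rather than taken from the paper.

```python
def leaves(tree):
    """Return the set of tip labels under a node (tips are plain strings)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some node's descendants are exactly the tips in `group`."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical rooted tree: three type I genes and two type II genes.
toy_tree = ((("typeI_a", "typeI_b"), "typeI_c"), ("typeII_a", "typeII_b"))

type_one_is_clade = is_monophyletic(toy_tree, {"typeI_a", "typeI_b", "typeI_c"})
mixed_is_clade = is_monophyletic(toy_tree, {"typeI_c", "typeII_a"})
```

In this toy tree the type I genes form a single clade, whereas a group that mixes one type I gene with one type II gene does not, because their common ancestor also gave rise to genes outside the group.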

Secondly, the highest rate of keratin diversification was seen between 400 and 200 million years ago, during which time the stem amphibians, reptiles, birds and mammals all evolved.  Combined with functional data suggesting that the keratin genes with the most dramatic radiations function in appendage formation (including one human homologue of a sauropsid keratin gene which reinforces palms and soles), this provides persuasive evidence for the driver of tetrapod keratin gene diversification being their new terrestrial habit.

Thirdly, the biggest diversification of keratins in the mammals was seen in the hair keratins, and orthologues (sequences descended from the same ancestral sequence, but separated by a speciation event) of the mammalian hair keratins were found in amphibians.  Keratins involved in hair production therefore likely evolved early in tetrapod evolution (in the stem tetrapods, the taxon that existed before the divergence of the tetrapods and is the ancestor of the amphibians as well), rather than later (in the amniotes, a taxon that formed after the split of the amphibians and that is the ancestor of anything with an egg adapted to land) as was previously thought.  Instead of hair keratin genes evolving anew by mutations in the amniotes, the authors suggest that they evolved from genes already present in the stem tetrapods.  Based on this, they propose an alternative hypothesis regarding the evolution of hair itself: “hair may have originated from glandular alpha-keratinized bumps in stem tetrapods”, rather than evolving from sensory appendages (as is currently thought).

And lastly, they found the same genes next to the keratin genes in fish and tetrapods – hence it is likely that the keratin gene cluster was organised before the divergence of fish and tetrapods (with subsequent duplication and gene evolution occurring within the confines of this genomic region).


Vandebergh, W. and Bossuyt, F. (2011) Radiation and diversification of alpha keratins during early vertebrate evolution.  Molecular Biology and Evolution, first published online 31.10.11

The power of gene duplication

Copying may seem like the slacker’s way; but for natural selection, it only increases the work it has to do.  When a gene in an organism is duplicated – as can happen for a single gene, a string of genes, a whole chromosome or even the whole genome – the organism suddenly has new roads of evolutionary possibility open to it.  Before duplication a gene has a hard time evolving, because deviation from its original function is usually harmful; however, after duplication the original copy can continue to carry out the original function, and natural selection can act on the spare copy, leading to the evolution of a new function.

A recent paper in Molecular Biology and Evolution tracked the evolution of the globin genes through vertebrate history, and showed that whole-genome duplication (AKA WGD, where every gene in the organism is replicated) has been integral to the evolution of their diverse functions.

Their main findings were:

1). Two rounds of WGD occurred during the evolution of the early vertebrates

2). The first duplication produced two copies of the globin gene (as well as every other gene in the genome): one copy was the ancestor of the myoglobins and cytoglobins; the other was the ancestor of the haemoglobins

3). The second round of WGD produced two copies of the haemoglobin gene (and all other genes): these two copies specialized into the alpha and beta globins

These findings are important because:

1). While these two rounds of WGD in early vertebrate evolution have been known about for a while, this provides another concrete example of their products, highlighting further the importance of gene duplication in evolution

2). We can pinpoint more specifically the timing of the division of labour between the myoglobins (which evolved to store oxygen in muscles) and the haemoglobins (which evolved to transport oxygen to respiring tissues), which was a crucial evolutionary development

3). We now know that the alpha and beta globins were produced by a second round of WGD, which allowed these copies to specialize into genes that are key to haemoglobin’s affinity for oxygen

The paper is also noteworthy for its sophisticated use of gene sequence comparison to reveal the evidence for two rounds of WGD.  When a genome is replicated twice, four copies of each gene are expected; however, puzzlingly the authors initially found only three copies of the globin gene.  Luckily for them, WGD also preserves the order of genes on the replicated chromosomes; therefore if one gene is missing, but the rest of a chromosome is present, we can infer that the chromosome (and the whole genome in this case) was in fact duplicated, and this gene must have been removed.  The authors therefore looked for regions of chromosomes which were syntenic (similar in gene order and arrangement) with the chromosomal regions containing the three globin genes, and traced the missing fourth globin gene copy to human chromosome 19.  They therefore had more evidence of two rounds of WGD, and showed that one copy of the globin gene had been lost, something that may have been missed had they just scanned the genome for similar sequences.
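The reasoning behind that synteny search can be sketched as follows.  This is a deliberately simplified toy, not the authors' analysis: it ignores gene order and only counts how many neighbouring gene families a region shares with the regions that do contain a globin gene.  The region names, gene family labels, threshold and function name are all hypothetical.

```python
def find_ghost_regions(regions, focal="globin", min_shared=3):
    """Find regions that lack the focal gene but share its usual neighbours.

    regions    -- dict mapping a region name to its list of gene families
    focal      -- the gene family whose missing copy we are looking for
    min_shared -- illustrative cut-off for "enough" shared neighbours
    """
    # Neighbouring gene families in each region that carries the focal gene.
    flanks = [set(genes) - {focal} for genes in regions.values() if focal in genes]
    ghosts = []
    for name, genes in regions.items():
        if focal in genes:
            continue
        neighbours = set(genes)
        # Require similarity to every focal-containing region.
        if flanks and all(len(neighbours & flank) >= min_shared for flank in flanks):
            ghosts.append(name)
    return ghosts


# Toy data: three regions carry a globin gene, a fourth has the same
# neighbours but no globin, and a fifth is unrelated.
regions = {
    "region_1": ["fam1", "fam2", "globin", "fam3", "fam4"],
    "region_2": ["fam1", "globin", "fam3", "fam4"],
    "region_3": ["fam1", "fam2", "globin", "fam4"],
    "region_4": ["fam1", "fam2", "fam3", "fam4"],
    "unrelated": ["fam7", "fam8", "fam9"],
}
ghost_regions = find_ghost_regions(regions)
```

Here the fourth region is flagged as the likely home of a lost globin copy: it looks like a duplicate of the other three in every respect except the globin gene itself, which a plain sequence-similarity search for globins would never reveal.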

Overall, this paper is a great example of the importance of gene duplication to the evolution of the globin gene family, as well as to genome organisation and physiology generally.  It also highlights the subtle and powerful analyses made possible by genome sequencing.



Hoffmann, F.G., Opazo, J.C. and Storz, J.F. (2011) Whole-genome duplications spurred the functional diversification of the globin gene superfamily in vertebrates.  Molecular Biology and Evolution, Advance Access