The key to great beer? Gene duplication

Brewing beer is a tricky business.  A lot of the ingredients, like the yeast and the hops, don’t go into the finished product and need to be filtered out.  In the case of yeast, this is made much easier by the process of flocculation: once all the sugars in the brew have been consumed during fermentation, the yeast falls out of solution and collects at the bottom of the tank.  The amount of yeast left in the brew affects both the taste and the cost of production: flocculation has to happen at the correct rate to produce a good beer at a good price.  Because of this, artificial selection pressure on flocculation has been very heavy during the domestication of yeast (Saccharomyces cerevisiae) for brewing (Jin and Speers, 1998).

Flocculation is controlled by a family of genes called the flocculins (Jin and Speers, 1998).  During artificial selection, individual flocculin genes could have evolved quickly, as has been documented for different genes in numerous other domestication episodes (for example, the teosinte branched1 gene in maize (Wang et al, 1999)).  Additionally, the family as a whole could have been selected to increase in size through duplication, with each extra duplicate in the family providing more raw material on which selection could act – examples of which are extremely rare.

To search for increases in gene family size during yeast evolution, Hahn et al (2005) measured the sizes of all gene families shared by five species of yeast.  They then measured the change in size of each gene family along the phylogeny of these five species, and compared this to the change expected by chance, to identify those families that got significantly bigger or smaller during yeast evolution.
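To picture what "the change expected by chance" might mean, here is a toy simulation.  It is not the method used by Hahn et al; it is a minimal sketch that assumes every gene copy has the same small chance of being duplicated or lost in each time step, and asks how often a family would reach a given size by drift alone.  The function names, the starting size of 8, the rate of 0.02 and the 50 time steps are all hypothetical values chosen for illustration.

```python
import random


def simulate_family(start_size, rate, steps, rng):
    """Random gain and loss of gene copies (a toy model, not the paper's)."""
    size = start_size
    for _ in range(steps):
        # Each existing copy may be duplicated, and each may be lost.
        gains = sum(1 for _ in range(size) if rng.random() < rate)
        losses = sum(1 for _ in range(size) if rng.random() < rate)
        size = max(size + gains - losses, 0)
    return size


def p_at_least(observed, start_size, rate, steps, trials=2000, seed=1):
    """Fraction of chance-only simulations that end at `observed` copies or more."""
    rng = random.Random(seed)
    hits = sum(
        1
        for _ in range(trials)
        if simulate_family(start_size, rate, steps, rng) >= observed
    )
    return hits / trials


if __name__ == "__main__":
    # Hypothetical numbers: an ancestral family of 8 copies, 14 copies observed.
    print(p_at_least(observed=14, start_size=8, rate=0.02, steps=50))
```

If only a small fraction of the simulated families reach the observed size, chance alone looks like a poor explanation for the expansion, which is the logic behind calling a family "significantly bigger".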

They found that the flocculin gene family increased dramatically in size along the branches leading to S. cerevisiae (brewer’s yeast).  While there are 6-11 flocculins in the other species of yeast, S. cerevisiae has 14 flocculins.  This increase in gene family size may be adaptive: selection on flocculation could have caused retention of flocculin duplicates, each of which can specialize in a different aspect of flocculation.

The next step would be to measure the evolutionary rate and function of each member of the flocculin family in S. cerevisiae, to confirm that they have been retained by selection.  However, this study is still a remarkable demonstration of the dramatic changes in gene copy number that can occur in response to strong selection, be it natural or artificial.  As the authors say:

“This is the first example to our knowledge, however, of selection on gene family size being implicated in domestication.”

So the next time you raise a glass, remember the key ingredient!



Jin, Y-L. and Speers, A. (1998) Flocculation of Saccharomyces cerevisiae.  Food Research International 31:421-440.

Hahn, M.W., de Bie, T., Stajich, J., Nguyen, C. and Cristianini, N. (2005) Estimating the tempo and mode of gene family evolution from comparative genomic data.  Genome Research 15:1153-1160

Wang, R-L, Stec, A., Hey, J., Lukens, L. and Doebley, J. (1999) The limits of selection during maize domestication.  Nature 398:236-239

Arms race gains a third layer of counteradaptive complexity

The RNAi pathway (covered before here) is a major immune mechanism in plants and invertebrates, but it can be suppressed by the very viruses it is trying to stop.  The virus produces an RNA Silencing Suppressor (RSS), a protein that binds or degrades proteins in the RNAi pathway, giving the virus free rein to replicate as it pleases.  This has the potential to create a fast-moving arms race: host suppresses virus using RNAi, so virus suppresses host suppression mechanism, so host suppresses the viral suppressor of host suppression…

Only the first two steps in this cycle have been found so far.  The RNAi machinery has been extensively documented, and many viral RSSs have been catalogued.  However, a recent paper in PNAS by Nakahara et al may have found the first instance of the third step in the cycle: a host suppressor of RSSs.

The authors focused on rgs-Cam, a tobacco plant protein that has previously been observed to interact with viral proteins (Anandalakshmi et al, 2000).  They found that it binds RSSs with greater affinity if they have an arginine-rich domain.  This domain is also what the RSS uses to bind small interfering RNAs (a constituent of the RNAi pathway), meaning that rgs-Cam may be specifically targeting only those proteins that suppress RNAi.

To test its effect on the activity of RSSs, the authors depleted rgs-Cam in tobacco (ironically by using RNAi to knock it down).  This led to increased suppression of RNAi by two RSSs (2b and HC-Pro).  They then created transgenic plants with either increased or decreased levels of rgs-Cam.   They found that those with more rgs-Cam had reduced RSS activity and were less susceptible to viral attack, whereas those with less rgs-Cam had increased RSS activity and were more susceptible to attack by viruses.

So it looks like rgs-Cam suppresses RSSs, and therefore restores the RNAi response.  But how?  The authors inhibited different cellular pathways and found that when the autophagy pathway was inhibited, the levels of rgs-Cam and RSSs both increased.  From this they concluded that once rgs-Cam binds an RSS, both are broken down by autophagy-like protein degradation (ALPD).

Interestingly, the authors highlight previous work that reported an inhibition of the RNAi mechanism by high levels of rgs-Cam.  This causes them to speculate that rgs-Cam may be triggered in emergencies only.  When the normal RNAi pathway can deal with the virus, rgs-Cam levels are kept low; however, when the virus gets out of hand, rgs-Cam is upregulated (or de-inhibited; the authors are admirably honest about the remaining ambiguity in the exact mechanism).

Roll on the discovery of the first VSHSRSS – Viral Suppressor of Host Suppression of RNA Silencing Suppressor!


Nakahara, K.S. et al (2012) Tobacco calmodulin-like protein provides secondary defense by binding to and directing degradation of virus RNA silencing suppressors.  PNAS Early Edition: 1-6.

Anandalakshmi, R. et al (2000) A calmodulin-related protein that suppresses post-transcriptional gene silencing in plants.  Science 290: 142-144.

Toxic science gets a thorough decontamination by Rosie Redfield et al

Extraordinary claims require extraordinary evidence, as the great Carl Sagan once said.  And they don’t come much more extraordinary than Wolfe-Simon et al’s highly publicized claim that bacteria can incorporate arsenic into their DNA instead of phosphate, and hence that the six fundamental elements of life (carbon, hydrogen, nitrogen, oxygen, sulphur and phosphorus) are not so fundamental after all.

The claim was met with a resounding outcry by the wider scientific community, with extensive methodological criticism highlighting many errors which could have introduced the illusion of arsenate incorporation.  However, Rosie Redfield and colleagues went the whole hog by replicating the study in its entirety, modifying it to include rigorous checks for contamination, the absence of which caused such a negative reaction to the original paper.  The results of this replication were published in Science Express on July 8th.

Wolfe-Simon et al had claimed that their growth medium contained no phosphate, and hence that growth when arsenic was added was a result of arsenate incorporation into the DNA of the bacteria (which was a strain called GFAJ-1).  However, Redfield et al refute this claim by pointing to the large body of literature cataloguing bacterial growth at the level of phosphate (3-4 µM) remaining in this ‘phosphate-free’ medium.  Conclusion: growth of GFAJ-1 when arsenate was added could have been caused by trace levels of phosphate in Wolfe-Simon et al’s medium.

It was also reported by Wolfe-Simon et al that GFAJ-1 cells grew very slowly in the medium, but grew faster when arsenic was added.  This was refuted by Redfield et al, who replicated the level of phosphate in Wolfe-Simon et al’s ‘phosphate-free’ medium, and observed significant growth.  Conclusion: GFAJ-1 doesn’t need arsenic to grow quickly at low levels of phosphate.

The most contentious claim in the original paper was that as much as 4% of the phosphate in the DNA backbone of GFAJ-1 was replaced by arsenate (Wolfe-Simon et al, 2011).  Redfield et al checked for the presence of arsenate bonds after three serial washes of the DNA with distilled water, and found arsenate present at a level of 5 × 10⁻⁸ M, 50-fold lower than the 4% claimed by Wolfe-Simon et al.  Incidentally, they found a similar level in their water blank, suggesting that even this low level is a result of remaining contamination.  Conclusion: high arsenate levels in the original paper were most likely a result of contamination, introduced by insufficient washing of DNA.

Logically, one would not expect arsenate bonds to exist, because they have previously been reported to be unstable, quickly breaking down by hydrolysis.  Wolfe-Simon et al had claimed that internal proteins or compartmentalization may protect arsenate bonds from this hydrolysis.  Unfortunately, this claim was also refuted by Redfield et al, who showed by gel migration that GFAJ-1 DNA is not associated with hydrolysis-protecting proteins.  They also compared the size of DNA fragments seen before and after removal of any potential hydrolysis-protecting proteins, and found no difference in the fragment size.  Conclusion: there is no hydrolysis-protection mechanism active in GFAJ-1 DNA, and hence arsenate bonds are not even logically possible in this bacterium.

So it would appear that there is no evidence (extraordinary or not) for arsenic incorporation.  I don’t think I can improve on the conclusion by Redfield:


“The end result is that the fundamental biopolymers conserved across all forms of life remain, in terms of chemical backbone, invariant.”


I suppose there are two ways to look at this.  The first is positive: after all, Redfield et al have ruthlessly employed the scientific method not only to highlight an erroneous claim, but to specify and quantify the source of the error.  But one can’t shake the negative perspective: valuable time has been spent disproving something that should never have been published in the first place.




Wolfe-Simon, F. et al (2011) A bacterium that can grow by using arsenic instead of phosphorus.  Science 332: 1163-1166.

Reaves, M.L. et al (2012) Absence of detectable arsenate in DNA from arsenate-grown GFAJ-1 cells.  Science Express 8 July: 1-4

Recurrent evolution of sticklebacks shown to be due (very broadly) to the same genes each time

Threespine sticklebacks (Gasterosteus aculeatus) have shown a recurrent and relatively predictable pattern of evolution.  Their ancestral home is marine, where they have long spines and a heavily armoured body; however, on numerous occasions they have moved into freshwater, and every time they do they lose their spines and armour.  This is due to a difference in predation pressure.  When in marine habitats, the main predators of the stickleback are birds and fish, which can only eat them if they can swallow them whole.  Long spines make it more difficult for something to swallow you, and armour makes it more likely you will survive a spell in the beak of a bird before being spat out.  But when in freshwater, the main predators of sticklebacks are insects, which grab hold of their prey instead of swallowing them in one.  With this kind of predation, spines and armour are a twofold disadvantage: they give your predator something to grab hold of, and they cost energy to produce, inhibiting body growth and therefore increasing predation risk.

The consistency of stickleback evolution is truly remarkable.  There are many instances worldwide of them moving from marine to freshwater environments, and in every one the same reduction in spines and armour is seen.  However, questions still remain regarding the genetic basis of this recurrent evolution.  How many genes are involved?  Is it mainly changes to protein coding or regulatory genes that enable this recurrent evolution?  And is there a set of “freshwater” genes in the genomes of all populations of sticklebacks, which are repeatedly selected for when they move from marine waters, or are different mutants selected for each time?  These are the questions that a recent open-access Nature paper by Jones et al set out to answer.

The authors took a whole genome approach to the problem, sequencing many different marine-freshwater pairs from across the globe and comparing them to a stickleback reference genome.  This avoided two limitations of previous studies.  The authors could look at every stickleback gene and assess its contribution, rather than deciding a priori to focus on a single gene.  And they stood more chance of detecting patterns of interaction between genes (epistasis), and whether a single trait was being affected by more than one gene (a polygenic trait).

First they generated a reference genome by sequencing one freshwater female, giving them a standard against which to compare the rest of their genomes.  They then chose 10 sites which showed the two characteristic morphs (assessed by morphometric analysis), encompassing both the Pacific and Atlantic Oceans, and sequenced one freshwater (spineless, no armour) and one marine (spiny, armoured) stickleback from each site.  To identify genomic regions under positive selection which would have driven the divergence between the two morphs, they used two methods.  The first was a Hidden Markov Model: this splits the genome into regions, calculates a phylogenetic tree for the 21 individuals at each of these regions, and groups these trees according to similarity.  This method identified 215 regions (90 after filtering) which separated marine individuals from their freshwater counterparts; the authors inferred that these were the regions likely to be under different selection pressures in either habitat.  The second method was a genetic distance approach: this again splits the genome into regions, and calculates for each region a cluster separation score (CSS), to quantify the level of marine-freshwater divergence at that region.  The number of divergent regions recovered by this method was 174 with a 5% false discovery rate (FDR; this is the expected proportion of false positives among the regions called divergent, which is related to but not the same as a p value threshold of 0.05), and 84 with a 2% FDR.  Without being overly clear about which filtering they are accepting, the authors conclude that 242 regions (0.5% of the genome) have been identified by either method, and 147 regions (0.2% of the genome) have been identified by both.
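For readers wondering how a false discovery rate threshold differs from a plain p value cutoff, here is a small sketch.  The paper's exact FDR procedure is not described above, so this uses the Benjamini-Hochberg procedure, one common way of controlling FDR, purely as an assumed stand-in.  The function name and the ten p values are made up for illustration and have nothing to do with the stickleback data.

```python
def benjamini_hochberg(p_values, fdr):
    """Indices of tests called significant at the given false discovery rate."""
    m = len(p_values)
    # Rank the tests from smallest to largest p value.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # The threshold grows with rank; remember the largest rank that passes.
        if p_values[idx] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


if __name__ == "__main__":
    # Hypothetical p values for ten genomic regions.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("p < 0.05:", [i for i, p in enumerate(pvals) if p < 0.05])
    print("5% FDR:  ", benjamini_hochberg(pvals, 0.05))
    print("2% FDR:  ", benjamini_hochberg(pvals, 0.02))
```

With these made-up values a plain p < 0.05 cutoff would pass five regions, whereas the 5% FDR threshold passes two and the 2% threshold passes one.  Tightening the FDR shrinks the list, which is the same pattern as the drop from 174 to 84 regions.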

The authors therefore regard as settled the question of whether the same variation is reused, or new variation is continually produced: 0.5% of the genome is an incredibly small proportion for recurrent evolution on this scale, and can only be explained by these relatively few genomic regions being used again and again to produce the same evolutionary pattern.

They next looked for what these regions did, by analysing 64 of the most divergent regions.  41% of these regions were non-coding and therefore regulatory, whereas only 17% were coding and showed non-synonymous differences (i.e. produced different protein products in the different environments).  The other 42% were either coding or non-coding, but did not show any non-synonymous differences between the environments.  The authors therefore concluded that regulatory changes account for a large majority of adaptive change.

And here I must confess some puzzlement with the execution of their next step.  The authors chose to test how many regulatory differences existed between the two morphs by sequencing RNA (no problems so far) from a marine and freshwater morph “born and raised under identical laboratory conditions.”  IDENTICAL LABORATORY CONDITIONS.  They found significant differences in the expression of 2,817 genes out of a total of 12,594 (around 22%).  One wonders how many differences would have been seen had they also compared RNA from fish in their natural habitats, the environments under which those regulatory differences have evolved.

So they had found the portions of the genome that differ between freshwater and marine sticklebacks, and had an idea what their function was.  However, answering this question only raises another: if these genes are constantly used during freshwater-marine divergence, how do they avoid being recombined during sex, which would produce an individual with some “freshwater” variants, and some “marine” variants?  As the authors say, “When adaptive divergence occurs in hybridizing systems, theory predicts that selection can favour molecular mechanisms that suppress recombination between independent adaptive loci” (Jones et al, 2012).  So these mechanisms are what they looked for next.

To do this they sequenced the genome of a marine and a freshwater morph in a hybrid zone in the River Tyne in Scotland.  Here, even though the two morphs are recombining their genes during mating, only the two distinct morphs survive, with any intermediates selected against.  They then looked for regions which had high CSS scores, and sharp transitions in their CSS scores at their boundaries.  This would act as a signature of an inverted region, which doesn’t undergo recombination and so passes through the generations as either a “freshwater” or “marine” complex.  They found three such regions, on chromosomes I, XI and XXI.  They then cloned these regions into bacteria, both to compare them more reliably with the reference genome and to sequence their surrounding regions more easily.  When clones were compared with the reference, only chromosomes I, XI and XXI were anomalous, further confirming their status as inversions.  Inverted repeats were also found in the sequence of their surrounding regions, a signature of inversion generation.  Cluster separation scores for the regions confirmed that marine and freshwater sticklebacks carry different forms of the inversions.  Finally, they looked for functional significance of these regions.  They found that the inversion on chromosome XXI contains “separate QTLs controlling armour plate number and body shape, traits that differ between marine and freshwater fish” (Jones et al, 2012).

So how successful have the authors been in answering their questions?  The first has been an undeniable success: that such a small fraction of the genome is consistently found to produce such large phenotypic changes is convincing evidence that the same genes are used repeatedly, rather than new mutations being required every time freshwater is invaded.  However, regarding the function of genes and the relative importance of coding and regulatory change, valuable initial data have been produced here, but no strong conclusions can be drawn from them.  The data here allow hypotheses to be made and candidate genes to be identified; however, experimental manipulations and data from multiple generations will be needed before conclusions can be drawn with any validity.


P.S. This paper was published as an open-access article, meaning an institutional login or massive payout is not required to read it, and that figures and content can be reproduced with a citation.  Let’s hope this soon becomes the norm (and that in the future, authors are not required to foot the bill to make their work open access).




Jones, F. et al (2012) The genomic basis of adaptive evolution in threespine sticklebacks.  Nature 484: 55-61

Peichel, C. and Boughman, W. (2006) Quick Guide: Sticklebacks.  Current Biology 13: 942-943

Marchinko, K. (2009) Predation’s role in repeated phenotypic and genetic divergence of armor in threespine stickleback.  Evolution 63: 127-138

Reimchen, T. and Nosil, P. (2004) Variable predation regimes predict the evolution of sexual dimorphism in a population of threespine stickleback.  Evolution 58: 1274-81

Vamosi, S. and Schluter, D. (2004) Character shifts in the defensive armor of sympatric sticklebacks. Evolution 58: 376-85

“The single greatest experiment in the history of biology”

“The single greatest experiment in the history of biology.”  Quite an accolade.  Yet this is how Richard Lenski (no stranger to seminal investigations himself) described work carried out in 1943 by Salvador Luria and Max Delbrück.

Picture yourself in 1943.  Evolution has now been established as the explanation for the seemingly-designed nature of life, but many questions still remain.  One of the greatest in both its simplicity and importance is this: does genetic variation occur because of the action of selection, or is it present before selection acts?  This is so fundamental to our understanding of evolution as to seem obvious; however, before 1943 the issue was still very much up in the air.

The structure of DNA was not to be discovered for another 10 years, somewhat precluding the analysis of protein structure or DNA sequencing.  Luria and Delbrück therefore used a far more elegant method to infer whether bacteria had mutated before or after selection. They first calculated the expected theoretical variability in growth when multiple colonies were exposed to viruses under either hypothesis (mutation-endowed resistance or acquired resistance).  They then experimentally measured growth of multiple cultures of E. coli when exposed to viral challenge, and compared their experimental findings to the theoretical expectations under each hypothesis.  If the experimental measures matched the expectations of one of these hypotheses, the question would be answered.

To carry out their experiment they established hundreds of cultures from a single bacterial cell each, and grew each culture up for a set time.  They then assessed their viral resistance by plating them onto virally-infected agar, and observing how many colonies formed (and therefore how many bacteria from the original culture were resistant).

Their theoretical calculations threw up some interesting findings (trust me here).  They modelled the frequency distribution (where a category is plotted on the x axis, and the frequency of individuals in that category is on the y) of the number of resistant bacteria in multiple individual cultures expected under each hypothesis.  They found that under the mutation hypothesis, there will be a high variance between different cultures, with some cultures with high numbers of resistant bacteria, causing this distribution to have a long tail at the right hand end.  Both of these expectations stem from the many generations in which resistant mutants can arise: if a mutant arises during the first bacterial generation, half of the bacteria in that culture will be resistant to viral infection; if a mutant arises during the last generation, there will be only one resistant bacterium.  In contrast, under the acquired hypothesis, each bacterium has the same probability of becoming resistant to the virus upon being exposed to it, and each becomes immune or not at the same moment (when they are plated onto the viral agar); therefore the frequency distribution under acquired immunity will have very short tails, and low variance between cultures.
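If you would rather not take the theory on trust, the two expectations can be checked with a small simulation.  This is a minimal sketch, not Luria and Delbrück's own calculation: it assumes synchronous doubling from a single cell, a fixed per-division mutation probability, and an acquired-resistance probability tuned so that both models give the same average number of resistant bacteria.  All function names and parameter values (500 cultures, 10 generations, a mutation rate of 1 in 1,000) are hypothetical.

```python
import random
from statistics import mean, variance


def mutation_culture(generations, mu, rng):
    """Resistant cells in one culture if mutations arise at random during growth."""
    sensitive, resistant = 1, 0
    for _ in range(generations):
        # Every cell divides; descendants of a resistant cell stay resistant.
        sensitive, resistant = 2 * sensitive, 2 * resistant
        # Each sensitive daughter has a small chance of mutating to resistance.
        new_mutants = sum(1 for _ in range(sensitive) if rng.random() < mu)
        sensitive -= new_mutants
        resistant += new_mutants
    return resistant


def acquired_culture(n_cells, p, rng):
    """Resistant cells if each cell independently becomes resistant on plating."""
    return sum(1 for _ in range(n_cells) if rng.random() < p)


def fano(counts):
    """Variance-to-mean ratio of resistant counts across cultures."""
    return variance(counts) / mean(counts)


def run(cultures=500, generations=10, mu=1e-3, seed=1943):
    rng = random.Random(seed)
    mutated = [mutation_culture(generations, mu, rng) for _ in range(cultures)]
    # Give the acquired model the same average, so only the spread differs.
    p = mean(mutated) / 2 ** generations
    acquired = [acquired_culture(2 ** generations, p, rng) for _ in range(cultures)]
    return mutated, acquired


if __name__ == "__main__":
    mutated, acquired = run()
    print("mutation hypothesis: variance/mean =", round(fano(mutated), 1))
    print("acquired hypothesis: variance/mean =", round(fano(acquired), 1))
```

Under the acquired model the variance stays close to the mean, while under the mutation model the occasional early mutant produces a "jackpot" culture that drags the variance far above the mean.  That difference in spread is the signature the experiment was designed to detect.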

So they knew their expectations.  All that remained now was to expose E. coli cultures to viruses, plot the frequency distribution of resistant bacteria, and measure the variance between cultures.

Their findings emphatically supported the mutation hypothesis.  The frequency distributions of the different cultures formed a curve with a long right-hand tail, and revealed a large number of cultures with over nine resistant bacteria, as predicted by a resistant mutant arising early in the bacterial pedigree of a culture.  The variance between different cultures was also massively higher than their theoretical predictions under the acquired immunity hypothesis, and even higher than they predicted under the mutation hypothesis.

These results are simply inexplicable under the acquired immunity hypothesis; the probability of a culture having over nine resistant bacteria is astronomically low, and the repeated observation of such cultures is nigh-on impossible.  However, they are perfectly explained by the mutation hypothesis.

Luria and Delbrück’s work also helped to settle a more philosophical question: is evolution a guided process?  By demonstrating that evolutionary change depends upon mutations that are randomly generated, they banished any element of guidance or divine, benevolent intervention from the evolutionary process.  When selection pressure is applied, time is up: the genetic variation has to be there already.




Luria, S.E. and Delbrück, M. (1943) Mutations of bacteria from virus sensitivity to virus resistance.  Genetics 28: 491-511

Lenski, R.E. (2011) Evolution in action: a 50,000-generation salute to Charles Darwin.  Microbe 6:30-33

The Salvador E. Luria Papers.  National Library of Medicine: Profiles in Science, accessed from

Luria-delbruck diagram.svg, Wikipedia, accessed from

Cool critters caught on camera!

I recently took a trip to Cumbrae, a small island off the west coast of Scotland.  All creatures great and small were plucked from pools and dredged from the depths, and I present the following highlights for your viewing pleasure.

First up we have the green paddleworm Phyllodoce lamelligera

And here it is extending its proboscis, most likely as a result of the stress of being on camera

Next we have Limacia clavigera, a nudibranch (sea slug)

Then there’s my favourite: Ophiocomina nigra, a brittlestar.  Its black colouration obscures the pentameral symmetry of its central disc, a body pattern so pervasive that its mouth is bordered by five jaws

Our penultimate aquatic invertebrate is a sea spider, Pycnogonum littorale

And finally we have a tardigrade of the genus Macrobiotus: looking at such powerful, aggressive behaviour you can easily tell how the group has earned the nickname “waterbears”

So with summer coming up, think twice before booking that Caribbean cruise: a rainy and cold Scottish coast might be much more beautiful!


PS. I should point out that none of the above were collected, identified or photographed by me: thanks very much to those who did the hard work.