Another paper about another phage protein isn’t usually cause for notice. There are lots of phage, and they have lots of proteins, and figuring out what they all do could occupy the efforts of scientists for several millennia.
Which is precisely my point.
What’s sometimes lost in all the excitement about genomics is that it is much easier to collect sequences than to assign meaning to them. The total number of gene sequences is now approaching a billion, representing nearly 10 trillion bases of sequence data. About 30% of the protein-coding genes in both humans and bacteria are of unknown function. They are there, and we can show (for bacteria, at least) that they are essential, but we don’t know why they exist or what they do. They are a mystery, the dark matter of biology.
But animal genes represent only a small fraction of the total sequence diversity present on Earth. Even though we don’t know what a lot of animal genes do, the overwhelming majority of them have relatives in other organisms, allowing us to bootstrap our way into analyzing their function by computational methods.
No such luck with viruses, including bacteriophage. About 75% of all viral genes are completely novel and have no relatives among the other billion sequences in GenBank. And it is estimated that 99% of viral genes remain to be discovered.
But we do have some clues. About two-thirds of phage genes can be knocked out without impairing lytic growth on permissive hosts. Keep in mind that phage have no “junk DNA” – all of it either codes for proteins or controls protein expression. The usual explanation for the apparent dispensability of these genes is that they are needed to maintain viability in a wider range of conditions than those found in a test tube.
And that’s consistent with the findings of Manning et al.: their unknown protein (gp44 of the well-known S. aureus phage 80α) appears to promote the phage’s transition to lysogeny (integration into the host chromosome).
It’s a reasonable guess that much of phage genomic dark matter consists of regulatory proteins that guide the phage’s decision-making, expanding the range of hosts and conditions under which it can successfully propagate. Another substantial fraction is probably dedicated to fending off host defenses, and in turn warding off other bacteriophage that want a share of the host’s resources.
These processes have the dynamics of a Red Queen race, in which each side is constantly under pressure to create novel attacks and defenses, prompting the deployment of equally novel countermeasures.
We’ve already profited greatly by the study of this ceaseless battle: most of the enzymes used in genetic engineering, including the Cas nucleases of CRISPR systems, are either encoded by phage or are encoded by bacteria as a defense against phage. The likelihood that we can find more of value – given that we have sequenced perhaps 1% of the genes out there – is near 100%.
Even more tantalizing is the possibility that many of these genes are used to manipulate bacterial behavior, especially (from a clinical standpoint) bad behavior. Nearly all virulence genes in S. aureus, including the genes that code for methicillin resistance (MRSA), are on mobile genetic elements either carried or mobilized by phage. It’s a good bet that phage encode proteins which modify or suppress these virulence factors – and thus might have therapeutic value in disarming invasive infections.
Sequencing the human genome is all well and good, but I am dubious that we will get much more therapeutic value from these efforts. It’s time to leave our familiar little genomic planet and start exploring the dark matter regions of the biosphere.
Hey Drew! Jim Manser (remember me?) here! Found your blog through MCDB 50th anniversary celebration. Excellent stuff and very well written! Didn’t know (or had forgotten) that you’re a fellow Grand Canyon Stater (me=Phoenix aka “Furnace”). Keep up the good work! Cheers, -Jim
Hey Jim – good to hear from you. If you are ever overcome by the urge to write a blog post or two I’d be happy to publish it here. Best, Drew