The most common viruses in your body don’t make you ill. Instead, they infect the legions of microbes that live in your gut. These bacteriophages, or phages for short, number in their trillions. And the most common of them might be a newly discovered virus called crAssphage.
No one has seen crAssphage under the microscope, but we know what its genome looks like—Bas Dutilh from Radboud University Medical Centre pieced it together using fragments of DNA from the stools of 12 individuals. He found crAssphage in all of them. Then, he found it in hundreds more.
To study the microbes that live in a person’s guts, scientists will typically collect a stool sample, break all the DNA within into small fragments, and sequence these pieces. The result is a metagenome: a mish-mashed collection of DNA from all the local bacteria, viruses and other microbes.
Dutilh’s team, led by Rob Edwards at San Diego State University, analysed 466 metagenomes that have been added to public databases and found crAssphage in three-quarters of them. It’s there in stool samples from people in the USA, Europe and South Korea. It actually accounted for 1.7 percent of all the sequences that the team analysed—six times more than all the other known phages put together. You probably have it inside you right now.
The work highlights just how much we don’t know about the viruses in our guts and “what exciting times these are for viral discovery”, says Lesley Ogilvie from the Max Planck Institute for Molecular Genetics.
But how could such a common virus go undiscovered for so long, especially considering how popular the study of gut microbes has become? It’s as if zookeepers suddenly realised that most of their zoos contain a giant grey animal with tusks and a trunk, which no one had noticed before.
For one thing, the viruses in our guts are hard to study. “To study a virus, normally you have to make heaps of it, which isn’t possible if you can’t grow the host,” says Martha Clokie from the University of Leicester. And since most gut bacteria won’t grow easily in a lab, the viruses that infect them are similarly hard to rear.
The alternative is to use metagenomics to analyse a microbe’s genes without having to grow it. But first, you have to assemble your mish-mash of sequences, which come from different organisms, into a complete genome. It’s a bit like putting all the pieces of a thousand jigsaw puzzles into one bag, and trying to solve just one.
The usual strategy is to work off what you know by aligning these new sequences to those in databases. But this approach doesn’t work very well for our inner viruses because most of them are unknown. The sequences in the databases represent the tip of the iceberg. According to Dutilh, around 75 percent of the DNA from any new stool sample—and as much as 99 percent—won’t match any of these known sequences.
So what’s in that other 75 percent?
Well, crAssphage for starters.
Dutilh’s team found it by using a different approach based on a simple idea: that fragments which repeatedly turn up in the same samples are more likely to be parts of the same genome. They used a technique called cross-assembly to identify one such group of co-occurring sequences, in stool samples from 12 people. They then assembled these sequences into a single genome.
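The co-occurrence idea can be sketched in a few lines of code. This is only an illustration of the principle, not the team's actual cross-assembly software: the fragment names, the abundance numbers, the 0.9 threshold and the simple grouping rule are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def group_cooccurring(profiles, threshold=0.9):
    """Group fragments whose abundance across samples rises and falls together.

    profiles maps a fragment name to its abundance in each sample.
    A fragment joins (and merges) every group that already contains
    a fragment it correlates with at or above the threshold.
    """
    groups = []
    for name, profile in profiles.items():
        matching = [
            g for g in groups
            if any(pearson(profile, profiles[m]) >= threshold for m in g)
        ]
        merged = [name]
        for g in matching:
            merged.extend(g)
            groups.remove(g)
        groups.append(merged)
    return [sorted(g) for g in groups]


# Hypothetical abundances of four DNA fragments in four stool samples.
profiles = {
    "frag_a": [10, 0, 5, 8],
    "frag_b": [20, 1, 11, 15],
    "frag_c": [0, 9, 1, 0],
    "frag_d": [1, 18, 3, 0],
}
groups = group_cooccurring(profiles)
print(groups)
```

In this toy data, frag_a and frag_b turn up together, as do frag_c and frag_d, so each pair is treated as a candidate for belonging to one genome.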
The genome had several distinctive features which told the researchers that it belonged to a phage, albeit one that’s very different to any we currently know of. They called it crAssphage after the cross-assembly method that revealed its existence.
They used the same technique to work out what the virus infects: if there’s lots of crAssphage DNA in a sample, there should also be lots of DNA from its host. Based on this logic, the most likely hosts are a group of bacteria called Bacteroides.
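The same logic can be sketched for host prediction: rank candidate bacteria by how closely their abundance tracks the phage's across samples. The article does not say which statistic the team used, so the Pearson correlation here is an assumption, as are the candidate names and every number in the example.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def rank_candidate_hosts(phage_abundance, host_abundances):
    """Return (host, score) pairs, best-correlated candidate first."""
    scores = {
        host: correlation(phage_abundance, abundance)
        for host, abundance in host_abundances.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical abundances across five samples (illustrative values only).
phage = [0.1, 2.0, 0.5, 3.1, 0.0]
candidates = {
    "Bacteroides": [5, 30, 12, 41, 3],
    "candidate_b": [8, 7, 9, 6, 8],
    "candidate_c": [12, 2, 9, 1, 14],
}
ranked = rank_candidate_hosts(phage, candidates)
for host, score in ranked:
    print(f"{host}: {score:.2f}")
```

A high score only suggests a link, since two organisms can rise and fall together for other reasons. That is why a second, independent check is useful.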
The team checked this result with a second technique. They looked at CRISPR sequences—a kind of bacterial immune system that recognises DNA from infecting phages. The team scanned all known bacterial genomes for CRISPR sequences that matched crAssphage and found that the closest matches came from two groups of gut bacteria, one of which was Bacteroides.
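A minimal sketch of the CRISPR check: count how many of a bacterium's stored spacer sequences appear in the phage genome, on either DNA strand. The sequences and host names below are invented for illustration, and the sketch assumes exact matching, whereas the team looked for the closest matches, so a real search would tolerate mismatches.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite DNA strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def count_spacer_hits(phage_genome, spacers_by_host):
    """Count, per host, the spacers found exactly in the phage genome.

    A spacer counts as a hit if it, or its reverse complement,
    occurs anywhere in the phage sequence.
    """
    hits = {}
    for host, spacers in spacers_by_host.items():
        hits[host] = sum(
            1 for spacer in spacers
            if spacer in phage_genome
            or reverse_complement(spacer) in phage_genome
        )
    return hits


# Hypothetical toy sequences; real genomes and spacers are far longer.
phage_genome = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
spacers_by_host = {
    "host_a": ["GTACGTTAGC", "TGTAATCGGA"],  # second matches the reverse strand
    "host_b": ["CCCCCCCCCC", "GGGTTTAAAC"],
}
hits = count_spacer_hits(phage_genome, spacers_by_host)
print(hits)
```

A bacterium whose spacers match the phage has probably met that phage, or a close relative, before, which makes it a plausible host.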
Bacteroides are major players in our guts. They help us break down our food, control the development of our immune system, and protect us from disease-causing bacteria. Their numbers change depending on the food we eat, and they correlate with our risk of different diseases. If crAssphage infects these microbes, it could also be an important player in our daily dramas.
It’s too early to speculate about what its role might be, says Dutilh. Still, we know that phages are generally important. By killing off the most abundant bacteria in the gut, they ensure that no single species can monopolise the space. And last year, Jeremy Barr, who was involved in this new study, showed that phages could even act as part of our own immune system.
Many scientists had assumed that viruses in the gut are caught up in fast-paced evolutionary battles with local bacteria. This would leave each person with a very different collection of viruses, and would explain why most of the viral sequences that we find don’t match anything in the databases. But the existence of crAssphage challenges this concept: it was part of the pool of unknowns but it’s also incredibly common. “It definitely changes the idea we had about viruses being very individual-specific,” says Dutilh. The study of human gut bacteria followed a similar path: early studies highlighted the differences between us but important similarities started emerging as our techniques became more sophisticated.
There are probably many more common viruses waiting to be discovered. “The biggest contribution of this work is the method they used,” says David Pride from the University of California, San Diego. “It provides a blueprint for further viral discovery.”
“What are we missing when we are unable to classify a sequence? What do we do with all of the sequence reads that we can’t classify? These are tough questions that we’ve been thinking about for years,” says Kristine Wylie from Washington University in St Louis. “This paper demonstrates that the community is developing clever approaches that can be used to mine those data.”
Reference: Dutilh, Cassman, McNair, Sanchez, Silva, Boling, Barr, Speth, Seguritan, Aziz, Felts, Dinsdale, Mokili & Edwards. 2014. A highly abundant bacteriophage discovered in the unknown sequences of human faecal metagenomes. Nature Communications. http://dx.doi.org/10.1038/ncomms5498