Think of the open-source project Nextstrain.org as an outbreak museum. Labs around the world contribute genetic sequences of viruses collected from patients, and Nextstrain uses that data to paint the evolution of epidemics through global maps and phylogenetic charts, the family trees for viruses.
So far, Nextstrain has crunched nearly 1,500 genomes from the new coronavirus, and the data already show how this virus is mutating—every 15 days, on average—as the COVID-19 pandemic rages around the world.
As menacing as the word sounds, mutations don’t mean the virus is becoming more harmful. Instead, these subtle shifts in the virus’s genetic code are helping researchers quickly figure out where it’s been, as well as dispel myths about its origins.
“These mutations are completely benign and useful as a puzzle piece to uncover how the virus is spreading,” says Nextstrain cofounder Trevor Bedford, a computational biologist at the Fred Hutchinson Cancer Research Center in Seattle.
This genetics-first approach to tracking the coronavirus has emerged as a bright spot among the barrage of devastating pandemic headlines. Similar science was instrumental in decoding previous epidemics, such as Zika and Ebola. But experts say the declining cost and increased speed and efficiency of genetic sequencing tools have made it possible for a small army of researchers around the world to document the coronavirus’s destructive path even faster. Those insights can help officials choose whether to shift from containment to mitigation strategies, especially in places where testing has lagged.
“If we go back to the Ebola virus five years ago, it was a year-long process from samples being collected to genomes being sequenced and shared publicly,” Bedford says. “Now the turnaround is much faster—from two days to a week—and that real-time ability to use these techniques in a way that impacts the outbreak is new.”
Tracking cases through mutations
Bedford’s lab has been using genetics to track the new coronavirus, known as SARS-CoV-2, since the first U.S. cases started to multiply in Washington State in February and March. Back then, public health officials focused on tracking patients’ travel histories and connecting the dots back to potentially infected people they’d met along the way.
Meanwhile, Bedford and his team turned to unlocking the virus’s genetic code by analyzing nasal samples collected from about two dozen patients. Their discovery was illuminating: By tracing how and where the virus had changed over time, Bedford showed that SARS-CoV-2 had been quietly circulating within the community for weeks after the first documented case in Seattle on January 21. That patient was a 35-year-old who had recently visited the outbreak’s original epicenter in Wuhan, China.
In other words, Bedford had scientific proof that people could unknowingly be spreading the coronavirus if they had a mild case and didn’t seek care, or if they had been missed by traditional surveillance because they weren’t tested. That revelation has fueled the frantic lockdowns, closures, and social distancing recommendations around the world in an attempt to slow the spread.
“One thing that’s become clear is that genomics data gives you a much richer story about how the outbreak is unfolding,” Bedford says.
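For readers who want to see the arithmetic behind this kind of inference, here is a minimal Python sketch. It is not Bedford's method or Nextstrain's code. It assumes two already-aligned sequences from the same lineage, one sampled earlier and one later, and it reuses the rough pace quoted above of one mutation about every 15 days. The two sequence fragments and the function names are made up for illustration.

```python
# Rough average pace quoted in this article; real analyses estimate it from data.
DAYS_PER_MUTATION = 15


def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def rough_days_elapsed(earlier: str, later: str) -> int:
    """Back-of-envelope time between two samples on the same lineage."""
    return count_differences(earlier, later) * DAYS_PER_MUTATION


# Hypothetical 20-letter fragments, NOT real SARS-CoV-2 data.
earlier_sample = "ATGGCTTACGGATCCTTAGA"
later_sample = "ATGGCTTATGGATCCTCAGA"

if __name__ == "__main__":
    print(count_differences(earlier_sample, later_sample))
    print(rough_days_elapsed(earlier_sample, later_sample))
```

Two differences at roughly 15 days apiece would suggest about a month of unseen spread between the samples. Real phylogenetic tools go much further, fitting whole trees and attaching uncertainty to every estimate, which is why researchers treat a simple count like this as a clue and not a conclusion.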
Nextstrain’s visualization tools have also helped engage a public that’s hungry to learn about the science of the coronavirus, says Kristian Andersen, a computational biologist at Scripps Research in La Jolla, California, whose lab has contributed more than a thousand genomes, including those of West Nile and Zika viruses, to the project.
“I like these tools because for the longest time it was just nerds like me looking at these trees, and now it’s all over Twitter,” says Andersen. The site’s open-source ethos has also generated genome-sharing enthusiasm among researchers around the world, who offer to send viral samples to his lab or who contact him looking for specific advice on how to sequence the virus. “They see the data display and say, ‘We have patients, too. We’d like to sequence them.’”
Although such charts and trees are useful for seeing the big picture of how the pandemic is unfolding, Andersen cautions casual visitors against jumping to conclusions, because they can’t see the more extensive background data. Case in point: Bedford had to backtrack on Twitter after suggesting that similar sequencing data from an infected German patient in Italy and a Munich patient who became infected a month earlier showed that the European outbreak had started in Germany.
“The tree might suggest a connection, but there are so many missing pieces in the transmission chain that there can be other explanations of what could have happened,” says Andersen.
And in places where testing and case-based surveillance are limited, Bedford says genetic data will continue to provide clues about whether all these social distancing interventions are working.
“We’ll be able to tell how much less transmission we’re seeing and answer the question, ‘Can we take our foot off the gas?’” he says.
Not a bioweapon
In addition, the ability to reveal the virus’s evolutionary history helped researchers quickly debunk conspiracy theories, such as the one that SARS-CoV-2 was secretly manufactured in a lab to be used as a bioweapon.
A March 17 article in Nature Medicine co-authored by Andersen makes this argument by comparing the genomic features of SARS-CoV-2 with all of its closest family members, including SARS, MERS, and strains isolated from animals such as bats and pangolins.
First off, most of SARS-CoV-2’s underlying structure is unlike that of any coronavirus previously studied in a lab. The novel coronavirus also contains genetic features that suggest it encountered a living immune system rather than being cultivated in a petri dish.
Moreover, a bioweapon designer would want maximum impact and might rely on history to obtain it, but the novel coronavirus carries subtle flaws indicative of natural selection. For instance, coronaviruses use what are known as spike proteins, which look like heads of broccoli, to bind and access cellular “doorways” called receptors. It’s how the viruses infect animal cells. Experiments have shown that the novel coronavirus strongly binds with a human receptor called ACE2, but the interaction isn’t optimal, the authors explain.
“This isn’t what somebody who wanted to build the perfect virus would have picked,” Andersen says. Overall, their analysis suggests the virus jumped from an animal to humans sometime in November.
In the future, genetic sequencing will become an even more important tool to identify local or regional viral flare-ups before they spread.
“If a potential virus was to emerge in a community in Africa, for example, we now have the ability to get samples to the lab to do shotgun sequencing,” says Phil Febbo, a physician and chief medical officer of Illumina, the world’s largest genetic sequencing machine manufacturer, based in San Diego, California. The shotgun technique allows researchers to sequence random genetic strands to identify a virus at a faster pace, so that officials can more quickly determine appropriate containment measures to stop transmission.
There’s still a lot of work to be done to create such a rapid-response global surveillance network: Labs have to be created. Governments have to get on board. Workers need to be recruited and trained to run sequencing machines and interpret results.
“It’s not a limitation of technology,” Febbo says. “It’s a matter of finding the right resolve as an international community.”