Understanding the origins of the Covid pandemic matters because the answer should guide how much we regulate work on pathogens that could start another pandemic. Did it come from some sort of lab leak (LL) or did it spill over from some animal host unconnected to research, by more traditional natural zoonosis (Zoo)? Although most people in the US and majorities or pluralities in many other countries think it came from a lab leak, the virology journals and the most prestigious general science journals are strongly on the Zoo side.
The question has gotten all tangled up with politics in a way that doesn’t help in sorting out the facts. (The current political lineups on this are a bit odd, since funding of the suspect type of research was nominally banned under President Obama and then allowed under President Trump.) One of the reasons that I think it’s important for scientists to speak out honestly on the subject is that otherwise we open the field more to a collection of anti-scientific cranks, especially ones who are anti-vax or even anti–air filtration. If the latter are the only ones making sense when talking about Covid origins, that makes it harder to argue that people should pay attention to scientists on other matters, not only Covid-related but also on climate.
How can a non-expert sort through the evidence to see which explanation is more likely? I don’t know any easy, quick way. One could try trusting the majority of the relevant experts, but that doesn’t always work when many of the experts themselves have too much at stake to be open with the public. On the other hand, Alina Chan recently published a persuasive article in the New York Times citing quite a bit of evidence that an accidental LL was more probable. Who to believe?
I’ve taken advantage of free time and scientific ties to try to sort this out. The bottom line, for which I’ll give some of the arguments here, is that it’s much more likely to have come from a lab accident. (Some crackpots imagine a deliberate release, but I won’t waste time discussing that.) My painfully long version of the argument, more quantitative and complete than Chan’s article and with more extensive links to references, is posted online.
The method I used to compare the odds of the possibilities is called Bayesian probability. It’s a valid method that can be used for a wide variety of decisions that have to be made under uncertainty, which is to say almost all important decisions. Perhaps that makes the method even more important than the conclusion. Introductions to it are easy to find online.
Here’s the basic Bayes method for deciding which of two competing stories is more likely to be true. You start with some odds that could favor one or the other just based on background knowledge. Then each new observation can swing the odds one way or the other, depending on whether it’s more surprising in one story or the other. Bayes’s theorem tells you how to translate the surprise into a numerical odds ratio.
For example, there could be two explanations for why my neighbor’s furnace exploded. One is that it was some breakdown of old equipment, the other is that there was a screwup in maintenance. A new maintenance guy had worked on the furnace the day before it exploded. Maybe maintenance had nothing to do with it, but then the timing would be a surprising coincidence. The timing would not be a surprise for a screwup. The timing supports but doesn’t prove the screwup story.
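For readers who like to see the arithmetic, here is a minimal sketch of the odds update for the furnace example. All of the numbers are made up purely for illustration (they are not estimates of anything), and the function name update_odds is just a label chosen for this sketch.

```python
def update_odds(prior_odds, p_obs_given_a, p_obs_given_b):
    """Return the odds of story A over story B after one observation.

    prior_odds     -- odds of A over B before the observation
    p_obs_given_a  -- how probable the observation is if A is true
    p_obs_given_b  -- how probable the observation is if B is true
    """
    likelihood_ratio = p_obs_given_a / p_obs_given_b
    return prior_odds * likelihood_ratio


# Hypothetical starting odds: screwup vs. old-equipment breakdown at 1 to 4.
prior_odds = 1 / 4

# Hypothetical "surprise" values for the timing (explosion the day after
# maintenance): unsurprising for a screwup, a coincidence for a breakdown.
p_timing_given_screwup = 0.5
p_timing_given_breakdown = 0.01

posterior_odds = update_odds(
    prior_odds, p_timing_given_screwup, p_timing_given_breakdown
)
posterior_probability = posterior_odds / (1 + posterior_odds)

print(f"odds for screwup after seeing the timing: {posterior_odds:.1f} to 1")
print(f"probability of screwup: {posterior_probability:.2f}")
```

With these invented inputs the timing alone multiplies the odds by 50, which is enough to turn a story that started out as the less likely one into the favored one without ever proving it.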
Here’s a rough ultra-condensed version of the Covid argument. We can get reasonable starting odds that pretty heavily favor Zoo from looking at past zoonotic epidemics and past lab leaks. We see lots of features that would be less surprising for LL than for Zoo. They shift the odds enough to end up making LL more likely than Zoo.
Here’s the most basic feature. The chances for LL come mostly from Wuhan, where a detailed research proposal (“DEFUSE”) described plans to patch together viruses like SARS-CoV-2. (A US agency turned down DEFUSE, but a Chinese Academy of Sciences grant quickly filled in the Wuhan funding.) The chances for Zoo come from locations spread over China and southeast Asia. Wuhan has only about one percent of China’s population. If the chances for Zoo were concentrated anywhere, it would be far to the south of Wuhan, where there was much more wildlife trade and where related viruses circulate. Knowing the pandemic started in Wuhan eliminates most of the chances for Zoo, but not for LL. A virus starting 1000 km south of Wuhan could first show up in the tiny Wuhan wet markets, a tram ride from the lab planning to work on those viruses, rather than in the massive wet markets in the south, but that’s just very unlikely. Our starting odds now shift toward LL by a factor of about 100, because we’ve scratched off at least 99 percent of the routes to Zoo but hardly any of the likely routes to LL.
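The factor of about 100 can be checked with a few lines of arithmetic. This is only a sketch: the one percent figure comes from the paragraph above, while treating “hardly any” lost LL routes as exactly none is a simplifying assumption, and the variable names are invented for this illustration.

```python
# Fraction of the routes to each story that survive once we learn the
# pandemic started in Wuhan.
p_wuhan_given_zoo = 0.01  # at most ~1% of Zoo routes remain (from the text)
p_wuhan_given_ll = 1.0    # assumption: essentially all LL routes remain

# The odds shift toward LL by the ratio of the two.
location_factor = p_wuhan_given_ll / p_wuhan_given_zoo

print(f"odds shift toward LL by a factor of about {location_factor:.0f}")
```

If some of the LL routes did lie outside Wuhan, p_wuhan_given_ll would be a bit below 1 and the factor correspondingly a bit below 100, which is why the text says “about.”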
Similar reasoning provides other important factors shifting the odds toward LL. Some features that would be expected for LL but surprising for Zoo are that no original animal host has been found; there were no independent outbreaks in other cities; and the virus was unusually well-adapted to humans right from the start. The viral sequence has several more detailed features that are highly unusual in nature but right along the lines proposed in DEFUSE. Chan’s article and my blog post describe those in detail.
What about the science that’s supposed to show that the virus spilled over from some unknown animal at a wet market? This is where the story gets to be embarrassing for science itself. Chan’s article partially describes the extreme problems with the key papers pushing the wet market story, giving some references (a non-paywalled version is available), and my blog goes into more depth. The journal Science, unfortunately, has resisted correcting all but a few of the most blatant coding errors. The scientists who caught them have had to publish about the other errors in other respectable journals, but ones less likely to catch the attention of the press.
Putting it all together, I get that the odds favor LL by about 500 to 1. The only reason that the odds don’t come out even more extreme is that I allow for a lot of uncertainty in estimating each of the factors.
One conclusion is simple. Playing with making more dangerous pathogens, called “gain-of-function research of concern,” should be banned. It has not produced any new therapies or vaccines, and was already known before Covid to risk a pandemic. Now it has probably produced an ongoing pandemic that has caused over 25 million excess deaths worldwide, and many more chronic illnesses. I think that the soft new regulations being implemented by the Biden administration won’t be nearly enough. We need laws with some teeth, applying to private research as well as to government funding. We’ll also need international agreements, since no matter where a pandemic starts it will affect the whole world.
This still leaves a problem for which I have no solution. Some crucial questions, such as how to fight diseases or how to avoid climate catastrophe, can’t be answered without some scientific analysis. Unfortunately, doing science requires a lot of time and attention. Falling back on trusting scientific authorities often works, but runs into problems when the authorities themselves are not open with the public.
The deepest lesson from all this is not that “research is the main source of pandemics.” The most likely source of the next one is the H5N1 influenza that is being shared back and forth between birds, cows, other mammals, and now sometimes people. Unlike Covid, which popped up suddenly ready to go, H5N1 is evolving as we watch in real time. We’re not doing nearly enough to track it and protect exposed people, mainly farmworkers. Although one lesson from Covid is to stop pointless risky research, the more general lesson is to pay attention, be honest, and use government to act before it’s too late.
Michael Weissman is a retired UIUC professor of physics. He did time for Vietnam War draft resistance and was an originator of the scientists’ boycott of Reagan’s Star Wars program. Since retirement he has used statistics to help catch two fake pollsters and to uncover serious errors in many physics education research papers.