Over the last 48 hours I’ve been mulling over the massive (30,000+ patient) RCT that underpins the Sputnik V vaccine and has led to millions of doses being sold by the Russian government’s RDIF.
I spent most of that time tossing up whether this really was just luck or was fraud. I’ve now found what I believe to be the smoking gun showing that the trial was faked.
Part 1: Long odds and long log odds.
I was recently reading the Phase III trial results for the Sputnik V vaccine for Covid-19 in The Lancet, and the results are impressive: a 92% vaccine efficacy is nothing to sneeze at. In fact it is very effective in all age groups, and essentially identically so. But then I noticed something odd.
Looking at the actual numbers involved, the authors got incredibly lucky. The results for efficacy are improbably uniform across ages. Every single result is between 90% and 93%, and every single one can be pushed outside that band by adding or subtracting a single case. The actual numbers involved are tiny.
For instance, in the first age group only a single person in the vaccine group got Covid. And it’s lucky they did. If they hadn’t, or if a single extra person had got Covid, the results would be way off. With one fewer case the result would be 100% effective; with one more it would drop to about 85%. This gives an idea of how fragile these results are. Achieving such a tight band of results on such tiny numbers is very lucky. Achieving it five times in a row is even more unlikely.
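To make the fragility concrete, here is a minimal sketch of the arithmetic. The arm sizes and the placebo case count are made-up round numbers chosen for illustration, not the counts published in the paper, and `vaccine_efficacy` is just a helper name I've picked. Efficacy is taken as one minus the relative risk.

```python
def vaccine_efficacy(cases_vax, n_vax, cases_placebo, n_placebo):
    """Efficacy as 1 minus the relative risk (vaccine attack rate / placebo attack rate)."""
    risk_vax = cases_vax / n_vax
    risk_placebo = cases_placebo / n_placebo
    return 1 - risk_vax / risk_placebo


# ASSUMPTION: hypothetical arm sizes and placebo cases, not the trial's real data.
N_VAX, N_PLACEBO, PLACEBO_CASES = 1500, 500, 4

for vax_cases in (0, 1, 2):
    ve = vaccine_efficacy(vax_cases, N_VAX, PLACEBO_CASES, N_PLACEBO)
    print(f"{vax_cases} vaccine-arm case(s): efficacy = {ve:.1%}")
```

Whatever the exact denominators, going from one vaccine-arm case to two doubles the relative risk, so a result in the low 90s has to fall to the mid 80s.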
But how unlikely?
There are numerous possible ways to compute this.
On Twitter I’ve done a rough hack-and-slash approximation with a binomial probability calculator, estimating it’s a 1 in 1,000 to 1 in 10,000 chance. While it’s quick and good enough for Twitter, there are some problems with it (especially in inferring a cumulative probability from shifting each result by one case).
Another way would be to treat each age group as an independent trial. This is valid because, even if the underlying effect IS identical, the groups are still independent.
So I did that, and whacked it into Stata (which I bought a license for today, because apparently my uni thinks they shouldn’t provide a license for the medical faculty, only business).
So what did we get?
Well, the first thing you’ll note is how incredibly lucky the authors got, visually. The square for each age group is the actual result. The further to the left it is, the more protection from the vaccine. But the confidence intervals are incredibly wide. We can only be 95% confident the result fell somewhere between 50% effective and >99% effective.
Despite these wide confidence intervals they got bang on the money every time.
But how unlikely is this?
Well, we would have expected about 996.4 out of 1,000 repeats to be less even between groups. So the researchers got about 3.6 in 1,000 lucky by this estimate.
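For anyone who wants to try this without Stata, one plausible route to a figure like that is a fixed-effect heterogeneity test (Cochran's Q) on the log odds ratios of the five age groups. This is a sketch under assumptions: I'm not claiming it reproduces the exact Stata command, the five strata below are invented placeholder counts rather than the paper's data, and the helper names are my own. With five groups Q has 4 degrees of freedom, and for exactly 4 degrees of freedom the chi-square upper tail has a simple closed form.

```python
import math


def log_odds_ratio(cases_vax, n_vax, cases_placebo, n_placebo):
    """Log odds ratio and its Woolf variance for one stratum (no zero cells allowed)."""
    a, b = cases_vax, n_vax - cases_vax
    c, d = cases_placebo, n_placebo - cases_placebo
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def cochran_q(strata):
    """Inverse-variance weighted sum of squared deviations from the pooled log OR."""
    effects = [log_odds_ratio(*s) for s in strata]
    weights = [1 / var for _, var in effects]
    pooled = sum(w * e for w, (e, _) in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, (e, _) in zip(weights, effects))


def chi2_upper_tail_df4(q):
    """P(chi-square with 4 df >= q). This closed form is valid only for 4 df."""
    return math.exp(-q / 2) * (1 + q / 2)


# ASSUMPTION: invented counts (cases_vax, n_vax, cases_placebo, n_placebo),
# one tuple per age group. Replace with the published table to reproduce anything.
HYPOTHETICAL_STRATA = [
    (1, 1500, 4, 500),
    (4, 3800, 13, 1250),
    (4, 4100, 15, 1350),
    (5, 3300, 16, 1100),
    (2, 1600, 8, 530),
]

q = cochran_q(HYPOTHETICAL_STRATA)
p_less_even = chi2_upper_tail_df4(q)
print(f"Q = {q:.3f}")
print(f"Share of repeats expected to be less even: {p_less_even:.4f}")
print(f"Share expected to be at least this even:   {1 - p_less_even:.4f}")
```

The upper tail is the share of repeats that would look less even than the observed data, so a value like 0.9964 corresponds to about 3.6 in 1,000 repeats being this even or more so. The chi-square approximation is shaky with counts this small, which is one reason to treat any such figure as an order of magnitude.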
There are lots of ways to come at this question. I doubt you’ll find anybody who thinks this is more likely than something on the order of x per 1,000 or less likely than something on the order of x per 100,000. Either way it’s a very rare level of luck to get such very, very similar results with such a tiny number of events.
But luck happens.
There’s no statistical threshold that is impossible. A 1 in 100 result happens in 1 in 100 trials. A 1 in 1000 result happens in 1 in 1000 trials, and so on.
So did the authors get lucky or is it fraud?
Part 2: This isn’t the only vaccine.
I had a number of good conversations on Twitter about this trial when I posted my concerns. Two in particular were helpful, one encouraging and one cautionary. Talking to David Manheim (https://twitter.com/davidmanheim) was helpfully cautionary: the inference only works if you assume the trial was faked outright, not just bumped up to make the trial more impressive. And it’s a massive undertaking by thousands, state sponsored and with massive coverage. The pre-test probability of fabricated data has to be low. He emphasized that there are lots of trials, and events that are rare on a single-trial basis do occur eventually.
Talking to Florian Naudet (https://twitter.com/NaudetFlorian) was reassuring. He had similar concerns and had published a letter in The Lancet, but was running into the same wall I was: there’s no level of improbability that alone shows something to be impossible. Rare events happen rarely, but they do happen.
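The "lots of trials" point is easy to put numbers on. As a minimal sketch, assuming the trials are independent and each has the same 1 in 1,000 chance of throwing up a fluke like this (both assumptions are mine, as is the helper name):

```python
def prob_at_least_one(p_single, n_trials):
    """Chance that at least one of n independent trials shows an event of probability p."""
    return 1 - (1 - p_single) ** n_trials


for n_trials in (1, 10, 100, 1000):
    chance = prob_at_least_one(1 / 1000, n_trials)
    print(f"{n_trials:>5} trials: {chance:.3f}")
```

With enough trials a 1 in 1,000 fluke stops being surprising somewhere, which is exactly why improbability on its own can't settle the question for one particular trial.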
So I decided to sanity check by looking for comparators.
After all, Moderna, Pfizer, AZ and J+J all have vaccines that went through phase 3 trials. How homogeneous were their results by decade? This led to hours of frustration. I searched high, I searched low, but not one other phase 3 trial by any manufacturer broke down the data into such fine age gradations.
Then it hit me.
Of course they didn’t.
Nobody would plan a trial around an analysis that works less than 1% of the time. Why would any statistician make that decision? These are trials of 20-50 THOUSAND patients, costing tens of millions of dollars. These aren’t exactly undergraduate student projects.
It would be crazy to plan this analysis up front.
But to be fair we don’t know that’s what the authors did. (Yet, keep reading.)
So we’re really left with two possibilities:
- The authors could somehow guarantee IN ADVANCE they would get far more homogeneous data than was statistically likely (i.e. fraud).
- The authors got very, very unexpectedly homogeneous data between groups, so decided to add on an analysis that only makes sense in this setting AFTER they had already seen their very lucky data (i.e. the “if you’ve got it, flaunt it” principle).
Part 3: A Question of Timelines.
One of the great things about big journals is that they check things. One of the things top tier journals (NEJM, JAMA, Lancet) are absolute on is they won’t accept a trial that isn’t pre-registered. You can’t come to a journal after the fact and say “so we did this”, you have to publicly put on a register what you are testing and how you define success.
So I went back to the Lancet paper. The Sputnik V trial was registered, and on a register that has absolute transparency of version history.
On 27th of August 2020, the researchers submitted their plan for decade by decade analysis to the registry. The month BEFORE the trial even started.
The researchers planned an analysis that no other vaccine manufacturer did, that had a tiny chance of working because it was wildly underpowered, and that relied on the wildly improbable luck of the data they went on to collect. The only real explanation here is that they knew what their results were before they injected the first patient. It’s a smoking gun for fraud.
Part 4: Summing Up
I can accept that people get lucky. 1 in 100 chances happen 1 time in 100, 1 in 1000 chances 1 time in 1000.
But I can’t accept knowing you’re going to get that lucky months in advance. These researchers knew they would be by August 2020, before the first patient was enrolled.
That can only be fraud.
This trial should be treated as fake. The Lancet article should be retracted immediately. The EMA and other regulators should not move ahead with trusting this trial, even if the missing paperwork is found.
Part 5: Pre-empting the responses from patriots and troll farms
One argument will be that this isn’t an underpowered study and it was a reasonable analysis because the trial was planned to be larger and run for longer. The luck is just that the homogeneity came so early. That’s rubbish and here’s why.
It ignores the LAYERS of luck and MAGNITUDE of luck that went into this result: not only was the result more homogeneous than anybody could have (legitimately) known before it started, the total numbers are higher too. The placebo group had an infection rate more than 1,000% higher than the rate in Russia at the time the trial was planned. Even if we assume they thought the rate would go up a bit, the number of events observed is still almost triple the number expected if every patient on the trial had faced, every single day, the highest rate of infection Russia has ever recorded.
If we calculate the number of infections in the first age group with TWICE as many patients (so more than were planned) followed up for the full six months, AND assume it’s six months plus the 21 days to be generous, using Russia’s infection rate at the time, and assume an efficacy of 75% (the level the trial was approved on), we get roughly:
6 cases in 1000 controls
4 cases in 3000 vaccinated
The 95% CI for that OR ranges from 0.06 to 0.79. From the most effective Covid vaccine on earth to clinically useless.
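If you want to check that interval yourself, a standard Wald (Woolf) interval on the log odds ratio gets you there from the two counts above. This is a sketch: I'm assuming the usual log-scale normal approximation with z = 1.96, and `odds_ratio_ci` is just a helper name of my own.

```python
import math


def odds_ratio_ci(cases_vax, n_vax, cases_placebo, n_placebo, z=1.96):
    """Odds ratio with a Wald (Woolf) confidence interval on the log scale."""
    a, b = cases_vax, n_vax - cases_vax              # vaccinated: cases, non-cases
    c, d = cases_placebo, n_placebo - cases_placebo  # controls: cases, non-cases
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# 4 cases in 3000 vaccinated versus 6 cases in 1000 controls, as in the text.
odds_ratio, lower, upper = odds_ratio_ci(4, 3000, 6, 1000)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An odds ratio of 0.06 corresponds to roughly 94% protection and 0.79 to roughly 21%, which is the spread between the best vaccine around and one nobody would bother with.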
NOBODY plans a trial this way; with numbers so small that even a highly effective intervention is quite likely to look like a dud by random chance. At least, nobody who has to rely on an experiment to generate their results does.