Lisa Bero on bias in clinical trials
October 29, 2008
This is the fifth in a series of notes from the ScienceWriters2008 meeting. Lisa Bero gave a talk on “Faulty clinical trials and financial conflicts of interest.” [10:18]
Lisa Bero started off with a diagram of “the cycle of bias in research”. Your research question influences the population you test it on, which influences your method, which influences how you conduct the study, which influences what you publish.
A published study influences the questions that form in other scientists’ minds as they set off on their own cycles, but can also affect meta-analyses based on the research, and can filter into guidelines for patient care.
One of the sources of bias Bero is most concerned about is, simply, financial conflicts of interest. In 2004, the National Cholesterol Education Program updated its guidelines for cholesterol treatment, and overnight the number of Americans who should be on cholesterol-lowering meds shot from 13 million to 40 million. Eight of the nine experts recommending the change had personal financial ties to drug companies that make statins, and the evidence they relied on came from five randomized clinical trials (RCTs) that were all funded by makers of statins.
Why worry about the funding of studies, though, if the study has been reviewed and the science is good? It turns out, in one of Bero’s own meta-analyses, that a drug-company-funded study is four times more likely to turn up a result favorable to the company’s own product than is an independent study.
While clinical trials tend to compare a new drug to a placebo, a better way to study bias is in head-to-head comparisons that pit two competing drugs against each other. If Drug A fares better than Drug B, you would expect the same results no matter who funds the study. But it turns out that Company A’s results are the reverse of Company B’s for the same comparison.
There are many possible reasons for this bias, which Bero couldn’t pin down for sure. The difference could be in how scientists frame their question, how they design the study, how they conduct the study, and how they decide whether a given result is worth publishing.
One source of bias that is fairly easy to demonstrate is dosage. If you want to compare your drug’s effectiveness to your competitor’s, you can give your drug in a higher dose and Brand X at a much lower level. If the drugs are equivalent, patients will experience more relief from the higher dose. On the other hand, if you want to show that your competitor’s drug has more side effects, you’ll give that drug in a higher dose than your own. The trick is to test those two hypotheses in separate studies, which is often what happens.
In an analysis of 56 NSAID trials, 40 showed similar results from both drugs, 16 showed a stronger effect from the manufacturer’s own drug, and zero came out in favor of the competitor. Bero’s team compared these results to the dosage of the drugs (in terms of their own dose-response curves, of course) and found that the competitor’s drug was usually given at a very low dosage – no wonder it had less of an effect.
Another sneaky way of reporting results in an extra-flattering light is to focus on the p-value (if you have a good p-value to show). This is the number that determines “statistical significance”, and statisticians like to see it at a level of 0.05 or less.
In one study of the constipation drug Zelnorm, the p-value showing its effectiveness was p < 0.0001. Loosely, that means a difference this large would turn up by chance less than one time in 10,000 if the drug did nothing, so it’s a safe bet that Zelnorm really did cause patients to poop successfully – sounds great, and as far as we know the p-value was correct.
The problem (aside from the fact that it caused heart attacks, which is a different story) is that it only had that effect in fairly few patients. With a placebo, 27% of patients had one extra poop per week. With Zelnorm, 40% did. There are several ways you can report that difference numerically, but one of the most intuitive (according to Bero, among others) is the NNT, or Number Needed to Treat. In this case, you have to give Zelnorm to 7.4 people to get just one patient with the desired (tee-hee) outcome. (Here is some more discussion of NNT – be sure to check out the chart on the side.)
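For anyone who wants to check the arithmetic, here is a minimal sketch of the NNT calculation, assuming the usual definition (one divided by the absolute difference in response rates). The function name is mine, not Bero’s. With the rounded 27% and 40% figures it gives about 7.7 rather than 7.4; my guess is that the slide used unrounded percentages, but that is an assumption.

```python
def number_needed_to_treat(control_rate: float, treated_rate: float) -> float:
    """NNT = 1 / (treated response rate - control response rate)."""
    difference = treated_rate - control_rate
    if difference <= 0:
        raise ValueError("treatment shows no benefit over control")
    return 1.0 / difference


# Rounded response rates as reported in the talk.
placebo_rate = 0.27
zelnorm_rate = 0.40

nnt = number_needed_to_treat(placebo_rate, zelnorm_rate)
print(f"Absolute difference: {zelnorm_rate - placebo_rate:.0%}")
print(f"Number needed to treat: {nnt:.1f}")
```

Either way, the point stands: the 13-point gap between drug and placebo is what matters to a patient, and the p-value says nothing about how big that gap is.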
So with those numbers, the drug didn’t work very well, even though the result was statistically significant. It caused the desired pooping, but not in very many patients. If you took Zelnorm, you had less than a 1 in 7 chance of experiencing that statistically significant effect. Bero ran some numbers up on the screen – it worked out to $155 per poop.
Bero then looked at a well-studied drug whose name I don’t remember. In the first study, the desired effect was observed, but the confidence interval was so large as to be meaningless. When you add in participants from the next study, and the next, and the next, the confidence interval shrinks substantially. That’s a good thing – it means that the drug’s effect is being pinned down to a narrow range of numbers. After about 5000 patients’ worth of studies, it’s obvious what the drug’s effect is, and that the confidence interval puts it squarely in the “definitely better than a placebo” category. And yet dozens more studies were done after that, showing the same effect of the therapy.
Bero says many of the studies were small, and done as marketing studies. A doctor would be asked to put 1-3 patients into the study, and the company would send out drugs for those patients. The point of the trial was to get the drug out to the community and get people talking about it, not primarily to test its effects. And so Bero says we would be better served by a few large, well-executed trials than by dozens of dinky marketing studies. (But then what would that do to her advice to “don’t ever believe a single study, even if I do it”? She’s a big fan of meta-analyses, which have their own problems.)
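To see why the confidence interval shrinks as patients pile up, here is a rough sketch. It is not Bero’s data: the response rates are hypothetical (borrowed from the Zelnorm numbers above purely for illustration), the patient totals are made up, and the interval is the simple normal-approximation one for a difference between two proportions, with patients split evenly between drug and placebo.

```python
import math


def ci_half_width(p_treated: float, p_control: float,
                  n_per_arm: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation 95% CI for p_treated - p_control."""
    standard_error = math.sqrt(
        p_treated * (1 - p_treated) / n_per_arm
        + p_control * (1 - p_control) / n_per_arm
    )
    return z * standard_error


# Hypothetical response rates, held fixed while the pooled sample grows.
p_treated, p_control = 0.40, 0.27
difference = p_treated - p_control

for total_patients in (50, 200, 1000, 5000, 20000):
    half = ci_half_width(p_treated, p_control, total_patients // 2)
    low, high = difference - half, difference + half
    verdict = "includes zero" if low <= 0 <= high else "excludes zero"
    print(f"{total_patients:>6} patients: "
          f"95% CI {low:+.3f} to {high:+.3f} ({verdict})")
```

The half-width falls with the square root of the sample size, so going from 5000 patients to 20000 only halves an interval that was already narrow. That is the sense in which the later studies added very little.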
When a company develops a drug, it begins by working out the drug’s pharmacodynamics and pharmacokinetics and doing animal studies. Clinical trials come in three phases:
- Phase I is done on healthy people, to gauge the drug’s safety
- Phase II consists of small studies on sick people
- Phase III trials are very large studies on sick people.
After that, a New Drug Application is submitted to the FDA, where various personages make reports and recommend risk management plans – and in about 20% of cases, a review board looks at the application. Surprisingly, most drugs don’t get that scrutiny. Then material from the NDAs can be made available, although Bero and her colleagues have found that large sections are redacted – including the researchers’ conflicts of interest, and sometimes important data like inclusion criteria and risk management recommendations. The redacted material is supposedly information that will harm the company’s competitive advantage, but (according to Bero, and really, to common sense) that information needs to be made available.
And soon, much of it will be. Companies submitting NDAs are now required to register their trials at ClinicalTrials.gov if they want to be able to publish the results. A new law, Public Law 110-85, states that all “basic results” must be included, although adverse event reporting is still optional. “Basic results” include participant flow (how many people started, completed, dropped out, and were excluded from the trial), the characteristics of the population, values of outcomes, and a contact person who can answer questions about the study.
Bero’s group will soon publish a study (I’m on the edge of my seat) comparing the information in NDAs to the published accounts of the same trials. We got a sneak peek of that data, which looks very interesting.
Now that we know about bias in clinical trials, what can we do about it? The options include banning conflicts of interest (which is what the Cochrane Collaboration does), putting restrictions on what the companies can do (such as requiring them to report data to ClinicalTrials.gov), and disclosure of the conflicts. Disclosure is the current situation, and it’s difficult to enforce. Many studies don’t state conflicts of interest, and journals usually don’t require such statements. Disclosure doesn’t prevent bias, Bero says, and actually might make it worse – since researchers figure readers will be taking their results with a grain of salt.
In Italy, drug companies have to pay a certain amount of money to the Italian version of the FDA when they submit their applications. This money goes to fund areas “where commercial research is insufficient,” namely, orphan drugs; head-to-head comparisons; and safety studies. This approach might not be practical in our government structure (it helps that Italy has a national health service) but it sure sounds like a good idea.
Upon request, Bero gave some advice for patient advocacy, suggesting Consumers United for Evidence-based Healthcare (CUE) as a resource. Key areas of concern are conflicts of interest on FDA advisory committees, and the openness of NDA data.
For writers, the critical questions to ask include:
- Why was the research done – is this just a marketing study?
- Who controls the research?
- How did it get published?
- Are the methods transparent?
- Who funded it, and do the researchers have personal conflicts?
- Are there unpublished data? Sometimes data points are excluded for interesting reasons.
When reading a study, Bero suggests two key places to look for evidence of bias. First, what subset of the results makes it to the paper’s conclusion (and abstract)? Is anything important missing? And second, she says it’s always worth doing the math: how many subjects were enrolled in the study, and how many are still there at the data reporting stage?
Another interesting thing to look at is the results – are the researchers measuring an actual outcome, like a number of heart attacks or deaths, or are they using a “surrogate outcome” like the results of a lab test? Drugs can be approved on the basis of just those lab tests. Surrogate outcomes are the basis for many of the recommendations for cholesterol-lowering statin drugs, for example, but those drugs may not have real-world effects with any kind of reasonable NNT.
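The enrollment math is simple enough to do on a napkin, but here is a small sketch anyway. The numbers are invented for illustration and the function name is mine; plug in the enrolled and analyzed counts from whatever paper you are reading.

```python
def attrition_fraction(enrolled: int, analyzed: int) -> float:
    """Fraction of enrolled subjects missing from the reported results."""
    if enrolled <= 0 or not 0 <= analyzed <= enrolled:
        raise ValueError("expected 0 <= analyzed <= enrolled and enrolled > 0")
    return (enrolled - analyzed) / enrolled


# Hypothetical trial: 400 subjects enrolled, 312 in the final analysis.
enrolled, analyzed = 400, 312
missing = attrition_fraction(enrolled, analyzed)
print(f"{enrolled - analyzed} of {enrolled} subjects "
      f"({missing:.0%}) are missing from the reported results")
```

A large gap isn’t proof of anything by itself, but it is a good prompt for a question to the authors: who dropped out, and why?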