Non-inferiority trials: Was the intervention compared to see if it is “no worse” than an established therapy?

Ricky Turgeon; Blair MacDonald

Most trials test for superiority, i.e. whether an intervention is superior to some comparator with respect to the primary outcome. In contrast, the objective of a non-inferiority trial is to test whether an intervention is “not much worse” than a comparator (usually the current standard of care) with regard to the primary outcome. The rationale for a non-inferiority design is that the new treatment offers some benefit other than increased efficacy, such as being safer, more affordable, or more convenient. While the fundamentals of non-inferiority trials are similar to those of superiority trials, some unique concepts are needed to appraise them critically.

Non-Inferiority Margins

The non-inferiority margin is closely related to the minimally important difference, which is the smallest difference in the effect on an outcome that would be meaningful to a representative group of patients. The non-inferiority margin is the yardstick by which non-inferiority is defined, and it is selected during the design of a non-inferiority trial. If the CI of the difference between the intervention and comparator crosses the non-inferiority margin, the intervention is deemed not non-inferior to the comparator. For example, suppose the non-inferiority margin for stroke is an RR of 1.2, and the observed RR is 0.9 with a 95% CI of 0.5 to 1.3. Since the upper end of the CI (1.3) exceeds the non-inferiority margin (1.2), the conclusion is that the treatment is not non-inferior. If the upper end of the CI had been 1.1, the conclusion would be that the treatment is non-inferior, given that 1.1 < 1.2.

Intuitively, the non-inferiority margin should equal the minimally important difference, and ideally it does; however, researchers may choose a more “generous” non-inferiority margin (i.e. one that allows a difference greater than the minimally important difference to be considered “not much worse”).
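The stroke example above can be expressed as a short check. This is a minimal sketch in Python, not a statistical method: the function name `is_non_inferior` is hypothetical, and it assumes a ratio measure on which higher values mean worse outcomes with the intervention, so only the upper end of the CI is compared with the margin.

```python
def is_non_inferior(ci_upper: float, margin: float) -> bool:
    """Return True when the upper CI limit stays below the non-inferiority margin.

    Assumes higher values mean worse outcomes with the intervention
    (e.g. a relative risk for stroke).
    """
    return ci_upper < margin


# Figures from the worked example: non-inferiority margin RR = 1.2
print(is_non_inferior(ci_upper=1.3, margin=1.2))  # False: not non-inferior
print(is_non_inferior(ci_upper=1.1, margin=1.2))  # True: non-inferior
```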

See the graphical depiction of these concepts below:

Plot 6. Graphical depiction of non-inferiority and related concepts.

Superiority and inferiority (consider the line of no difference):

The treatment is considered superior when the upper end of the CI is below the line of no difference (0 in this case).
The treatment is considered inferior when the lower end of the CI is above the line of no difference.

Non-inferiority and not non-inferiority (consider the non-inferiority margin):

The treatment is considered non-inferior when the upper end of the CI falls to the left of the non-inferiority margin.
The treatment is considered not non-inferior when the CI crosses to the right of the non-inferiority margin.
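The four rules above can be combined into one small classifier. This is a sketch under stated assumptions: the function name `classify` and the numeric inputs are hypothetical illustrations, the effect is a difference with 0 as the line of no difference, higher values favor the comparator, and the margin lies to the right of 0.

```python
from typing import List


def classify(ci_lower: float, ci_upper: float, margin: float,
             no_difference: float = 0.0) -> List[str]:
    """Label a CI against the line of no difference and the non-inferiority margin."""
    labels = []
    if ci_upper < no_difference:
        labels.append("superior")       # whole CI left of no difference
    if ci_lower > no_difference:
        labels.append("inferior")       # whole CI right of no difference
    if ci_upper < margin:
        labels.append("non-inferior")   # CI stays left of the margin
    else:
        labels.append("not non-inferior")  # CI reaches or crosses the margin
    return labels


# Hypothetical confidence intervals with a margin of 2
print(classify(-3.0, -1.0, margin=2.0))  # ['superior', 'non-inferior']
print(classify(-1.0, 1.5, margin=2.0))   # ['non-inferior']
print(classify(0.5, 1.5, margin=2.0))    # ['inferior', 'non-inferior']
print(classify(-1.0, 3.0, margin=2.0))   # ['not non-inferior']
```

The third case shows that a treatment can be statistically inferior yet still meet the non-inferiority criterion when the whole CI sits between the line of no difference and the margin.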

See Mulla SM et al. for more information on the questions asked below. See Hong J et al. for information concerning deficits in non-inferiority trial reporting.

Checklist Questions

Is the non-inferiority design justified by some other advantage of the intervention versus the comparator?

Did the trial use a non-inferiority margin based on a relative or an absolute risk difference?

Is the non-inferiority margin well justified based on statistical reasoning and clinical judgment?

Is the non-inferiority margin strict enough according to your own judgment?

Was non-inferiority demonstrated in both intention-to-treat (ITT) and per-protocol analyses?

Was the comparator appropriate?

Has the active comparator demonstrated unequivocal superiority over placebo in previous trials?

Was the effect of the comparator in this trial consistent with that of previous trials?

Is the non-inferiority design justified by some other advantage of the intervention versus the comparator?

If the intervention is non-inferior but not superior, it should have another meaningful advantage that justifies considering it for your patients. Consider and quantify:

Fewer, less frequent, or less-severe adverse effects
Fewer drug interactions
Easier to take
Less intensive or less invasive monitoring required
Lower cost

Note: The advantage of the non-inferior intervention should not be included in the primary outcome being tested for non-inferiority. This biases the results in favor of the new intervention.

E.g. In PRAGUE-17 (Osmancik P et al.), an RCT comparing percutaneous left atrial appendage occlusion (LAAO) with direct-acting oral anticoagulants (DOACs) in patients with atrial fibrillation and a history of bleeding, the primary outcome was a composite of:

Ischemic or hemorrhagic stroke
Transient ischemic attack
Systemic embolism
Cardiovascular death
Procedure-/device-related complications
Major or non-major clinically relevant bleeding

However, since the justification for testing whether LAAO was non-inferior to DOACs was that LAAO may offer a lower risk of bleeding, it was inappropriate to include bleeding in the primary outcome being tested for non-inferiority. Indeed, bleeding events accounted for nearly half of all primary outcome events, and excluding them would not have allowed a conclusion of non-inferiority (LAAO would be “not non-inferior” to DOAC).

Did the trial use a non-inferiority margin based on a relative or an absolute risk difference?

Non-inferiority margins based on an absolute risk scale can falsely conclude that an intervention is non-inferior if event rates are lower than expected, which commonly occurs.
Relative risk non-inferiority margins are more conservative, and therefore preferable, because they scale to the incidence of outcomes in the trial.

E.g. In SPORTIF V (Albers GW et al.), the intervention was non-inferior according to the absolute risk difference non-inferiority margin of 2%, but it would not have been non-inferior if a relative risk non-inferiority margin of 1.67 – based on the same previous study data – had been used. The discrepancy was caused by a lower-than-expected event rate of 1.2% in the warfarin group (vs. expected 3.1%).
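To see why a fixed absolute margin becomes more permissive as event rates fall, the sketch below converts the 2% absolute margin into the relative risk it would permit at the expected and observed warfarin event rates quoted above. The function name `implied_relative_margin` is hypothetical, and this simple conversion only illustrates the direction of the effect; it does not reproduce how the trial's margins were derived.

```python
def implied_relative_margin(control_rate: float, absolute_margin: float) -> float:
    """Relative risk permitted by an absolute margin at a given control event rate.

    Both arguments are in percentage points.
    """
    return (control_rate + absolute_margin) / control_rate


at_expected_rate = implied_relative_margin(control_rate=3.1, absolute_margin=2.0)
at_observed_rate = implied_relative_margin(control_rate=1.2, absolute_margin=2.0)

print(f"{at_expected_rate:.2f}")  # 1.65
print(f"{at_observed_rate:.2f}")  # 2.67
```

At the expected event rate the absolute margin corresponds to a relative risk close to the 1.67 relative margin mentioned above, but at the lower observed rate the same 2% permits a much larger relative increase.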

Is the non-inferiority margin well justified based on statistical reasoning and clinical judgment?

A trial’s non-inferiority margin should be justified on the principle that the intervention being studied is (1) “not much worse” than (non-inferior to) the comparator, and (2) still better than nothing/placebo. Rules for an appropriate non-inferiority margin:

Defined prior to undertaking the trial
Justified relative to the minimal important difference (previously termed the minimal clinically important difference), which should be defined based on prior evidence
Preserve the effect of the standard treatment over placebo

E.g. #1 In RE-LY (Connolly SJ et al.), an RCT comparing dabigatran to warfarin for the prevention of stroke and systemic embolism in non-valvular atrial fibrillation, the pre-specified non-inferiority margin was a relative risk of 1.46. This was based on half the “worst case” end of the CI for benefit with warfarin vs. placebo. In other words, if RE-LY proved non-inferiority of dabigatran, it would, at its very worst, be ~2/3 (1÷1.46) as good as warfarin for this outcome.

E.g. #2 In RESET (Kim B-K et al.), an RCT comparing 3 months vs. 12 months of clopidogrel (added to aspirin) following drug-eluting stent placement, the non-inferiority margin was set as an absolute risk difference of 4% without rationale. At the expected control-group event rate of 11%, this would allow for a “worst case” relative risk increase of ~36% (4% ÷ 11%). For comparison: in CREDO, the addition of clopidogrel to aspirin vs. aspirin alone reduced the primary outcome by only an absolute 3% (relative risk reduction of 27%) in a similar population. In other words, the chosen non-inferiority margin allowed for the shorter course of clopidogrel to be similar to or worse than placebo.
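The rule that a margin should preserve the comparator's effect over placebo can be checked with simple arithmetic. The sketch below uses the RESET margin and the CREDO absolute benefit quoted above; the function name `fraction_of_effect_preserved` is hypothetical, and the calculation assumes the CREDO point estimate applies directly to the RESET population and ignores its CI.

```python
def fraction_of_effect_preserved(comparator_benefit: float, margin: float) -> float:
    """Share of the comparator's benefit over placebo that remains in the worst case.

    Both arguments are absolute risk differences in percentage points.
    A result of 0 or less means the margin permits losing the entire benefit.
    """
    return 1 - margin / comparator_benefit


# CREDO absolute benefit of 3% vs. RESET non-inferiority margin of 4%
reset = fraction_of_effect_preserved(comparator_benefit=3.0, margin=4.0)
print(round(reset, 2))  # -0.33
```

A negative value means the margin is larger than the whole benefit being protected, which is the concern raised about RESET.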

Is the non-inferiority margin strict enough according to your own judgment?

Ultimately, you as the reader need to decide for yourself if the non-inferiority margin is reasonable and acceptable.

E.g. #1 In ROCKET-AF (Patel MR et al.), an RCT comparing rivaroxaban to warfarin in patients with atrial fibrillation with the primary outcome of stroke or systemic embolism, the non-inferiority margin was a relative risk of 1.46. Given the actual rate of this outcome in the warfarin group (2.2 events per 100 patient-years), a 1.46 margin would have amounted to an increase of ~1 event per 100 patient-years. As such, this is a reasonable non-inferiority margin.
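Translating a relative margin into absolute terms, as done for ROCKET-AF above, makes it easier to judge whether the margin is acceptable. The following is a minimal sketch; the function name `absolute_excess_at_margin` is hypothetical.

```python
def absolute_excess_at_margin(control_rate: float, relative_margin: float) -> float:
    """Extra events permitted at the margin, in the same units as control_rate."""
    return control_rate * (relative_margin - 1)


# Warfarin group: 2.2 events per 100 patient-years; margin: relative risk 1.46
excess = absolute_excess_at_margin(control_rate=2.2, relative_margin=1.46)
print(f"{excess:.2f}")  # 1.01 events per 100 patient-years
```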

E.g. #2 In PRAGUE-17 (Osmancik P et al.), the non-inferiority margin was such that it allowed for 5% absolute risk increase in the primary outcome (stroke, transient ischemic attack, systemic embolism, cardiovascular death, major or nonmajor clinically relevant bleeding, or procedure-/device-related complications) with LAAO versus DOAC. Many clinicians and patients would consider a 5% absolute increase in this composite (which includes the purported advantage of less bleeding with LAAO versus anticoagulation) to be clinically important and therefore reject non-inferiority of LAAO based on this margin.

Note that the non-inferiority margin refers to an acceptable boundary for the “worst case” end of the CI, not the point estimate itself.

Was non-inferiority demonstrated in both intention-to-treat (ITT) and per-protocol analyses?

As is the case with superiority trials, ITT analysis is preferred as the primary analysis as it preserves the advantages of randomization and minimizes attrition bias. However, ITT may attenuate outcome differences between groups and make it easier to demonstrate non-inferiority.
Per-protocol analysis aims to isolate the effect of the intervention by excluding patients who did not receive study treatment “per-protocol”, such as patients who dropped out or received the intervention intended for the other treatment group (“crossover”). In many cases, dropouts and crossovers are due to intervention inefficacy/intolerance and/or are associated with patient prognosis, so excluding these patients introduces bias. Some falsely believe that the per-protocol analysis is therefore always the more conservative analysis for non-inferiority trials; however, that is only the case if the bias that is introduced favors the comparator. In other words, using the per-protocol analysis where protocol violations or crossovers occur more frequently in the comparator group will bias the results in favor of concluding that the intervention is non-inferior.
In most cases, discrepancies between ITT and per-protocol analyses suggest that bias has been introduced into the trial. As a general rule, non-inferiority should only be accepted/concluded if it is demonstrated in both the ITT and per-protocol analyses.
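The general rule above can be sketched as a check that both analyses clear the margin, with a flag for discordant results. The function name `appraise_itt_and_pp` and the example numbers are hypothetical, and the sketch assumes a measure on which higher values mean worse outcomes with the intervention.

```python
from typing import Tuple


def appraise_itt_and_pp(itt_ci_upper: float, pp_ci_upper: float,
                        margin: float) -> Tuple[bool, bool]:
    """Return (accept_non_inferiority, analyses_disagree)."""
    itt_ok = itt_ci_upper < margin
    pp_ok = pp_ci_upper < margin
    accept = itt_ok and pp_ok        # require both analyses to show non-inferiority
    disagree = itt_ok != pp_ok       # a discrepancy suggests bias in the trial
    return accept, disagree


# Hypothetical upper CI limits against a relative risk margin of 1.2
print(appraise_itt_and_pp(itt_ci_upper=1.10, pp_ci_upper=1.15, margin=1.2))  # (True, False)
print(appraise_itt_and_pp(itt_ci_upper=1.10, pp_ci_upper=1.30, margin=1.2))  # (False, True)
```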

In a systematic review of 231 non-inferiority RCTs published in five high-impact journals from 2005 to 2014, only 45% of non-inferiority RCTs reported both ITT and per-protocol analyses. When both were reported, discrepancies between analyses (in terms of demonstrating non-inferiority) occurred in 6% of comparisons. Neither analysis was consistently more conservative, with the ITT being more conservative in 50% of discrepancies (Turgeon RD, Reid EK, et al.).

Was the comparator appropriate?

The comparator intervention should:

Be consistent with the current standard of care. This can be assessed by scanning local institution policy and/or national guidelines.
Be more effective than nothing/placebo. This can be assessed by scanning tertiary references such as DynaMed and UpToDate for high-quality evidence demonstrating clinically important benefits of the comparator.
Have an effect that is consistent with that of previous trials

E.g. In RE-LY (Connolly SJ et al.), the yearly incidence of stroke in the warfarin group was 1.6%. In a meta-analysis of older trials, the yearly incidence of stroke was 2.2%. This indicates that warfarin administration, monitoring, or the population studied may have differed from previous trials.

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
