6 Secondary outcomes: Can conclusions be made from outcomes other than the primary one?

Most trials designate one outcome as a "primary" outcome (or 2-3 “co-primary outcomes”) and all other outcomes as "secondary" outcomes. Designation of an outcome as “primary” is done to determine and justify sample size calculations prior to conducting a study. In other words, the primary outcome is not necessarily the most clinically important (it often isn’t), and should not be the sole consideration as to whether an intervention is “better” than a comparator.

The interpretation of secondary outcomes requires additional considerations. The probability of finding a difference simply due to chance increases as the number of outcomes increases.

Checklist Questions

Are we trying to find a difference in a secondary outcome when there was no statistically significant difference between groups for the primary outcome?
Was the secondary outcome one of a small number of secondary endpoints defined in the original protocol? If there was a positive finding, were there appropriate statistical adjustments made?
Does the secondary endpoint result make sense in the context of the primary (and other secondary) outcome findings?
Was there an unexpected positive finding for a rare outcome?

Data-mining: Are we trying to find a difference in a secondary outcome when there was no statistically significant difference between groups for the primary outcome?

Authors may emphasize statistically significant differences in secondary outcomes when they fail to find a statistically significant difference in the primary outcome.

For example, a review (Khan MS et al.) of 93 cardiovascular RCTs found that spin (i.e. reporting strategies highlighting benefits despite a non-statistically significant primary outcome) was present in 57% of abstracts and 67% of main texts.

E.g. In the FIELD trial (Keech A et al.), the difference in the primary outcome (coronary events up to 5 years) was not statistically significant between fenofibrate and placebo in patients with type 2 diabetes. In their conclusions, the authors highlighted marginally statistically significant reductions in 3 of 9 secondary efficacy outcomes (total cardiovascular events, non-fatal myocardial infarction, and revascularization).

Minimizing multiplicity: Was the secondary outcome one of a small number of secondary endpoints defined in the original protocol? If there was a positive finding, were there appropriate adjustments made?

More comparisons increase the risk of finding a difference when there is none, as depicted:

Graph 1. Probability of at least one false positive result by number of outcomes tested, assuming no true difference for any outcome and a threshold for statistical significance of p<0.05.
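The relationship in Graph 1 can be reproduced with a short calculation. The following is a minimal sketch, assuming the outcomes are statistically independent (trial outcomes are often correlated, in which case the true probability is somewhat lower). Under that assumption, the probability of at least one false positive across k outcomes is 1 − (1 − α)^k. The function name is illustrative only.

```python
def prob_at_least_one_false_positive(n_outcomes: int, alpha: float = 0.05) -> float:
    """Probability of >=1 false positive across n_outcomes tests.

    Assumes no true difference for any outcome and that the tests
    are independent (an assumption, not stated in the text above).
    """
    if n_outcomes < 0:
        raise ValueError("n_outcomes must be non-negative")
    # P(no false positive in one test) = 1 - alpha, so
    # P(no false positive in any of n tests) = (1 - alpha) ** n.
    return 1.0 - (1.0 - alpha) ** n_outcomes


for k in (1, 3, 5, 10, 20):
    print(f"{k:>2} outcomes: {prob_at_least_one_false_positive(k):.1%}")
# Prints 5.0%, 14.3%, 22.6%, 40.1% and 64.2% respectively.
```

Under these assumptions, a trial reporting 10 outcomes has roughly a 40% chance of at least one "statistically significant" result even if the intervention has no effect on any of them.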

Depending on the context, it may be justified to adjust for multiplicity when considering multiple outcomes. Adjusting for multiplicity means requiring a lower p-value threshold for statistical significance to account for multiple comparisons. There are several methods for calculating the stricter threshold (Bonferroni correction, Holm procedure, etc.). There is no consensus on when to adjust for multiplicity, but the following can act as general guidance:

Table 10. Circumstances where adjusting for multiplicity may be necessary.

Circumstance | Whether adjustment is necessary
At least one outcome is positive and the outcome is intended to inform future research rather than be incorporated directly into clinical practice (i.e. exploratory) | Adjustments in the analysis are not warranted, as such findings are used only to generate hypotheses
At least one outcome is positive and the outcome is intended to directly inform clinical practice (i.e. confirmatory) | Adjustments may be necessary if: more than one dose is compared; more than one primary outcome is used; or the primary outcome was assessed in multiple different populations
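
The Bonferroni and Holm adjustments named earlier in this section can be sketched as follows. This is an illustrative sketch only, not the method used by any particular trial: the function names are hypothetical, and the p-values in the example are made up for demonstration rather than taken from a study. Bonferroni compares every p-value against α divided by the number of comparisons. Holm sorts the p-values, compares the smallest against α/m, the next against α/(m − 1), and so on, stopping at the first one that fails.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return True for each p-value that stays significant after Bonferroni."""
    m = len(p_values)
    # Every comparison must meet the same stricter threshold alpha / m.
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Return True for each p-value that stays significant after Holm's step-down."""
    m = len(p_values)
    # Indices of p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Threshold relaxes at each step: alpha/m, alpha/(m-1), ..., alpha/1.
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one fails, all larger p-values also fail
    return reject


# Hypothetical p-values for four outcomes (illustration only).
example = [0.004, 0.015, 0.03, 0.30]
print(bonferroni_reject(example))  # [True, False, False, False]
print(holm_reject(example))        # [True, True, False, False]
```

In this made-up example, three of four outcomes have p<0.05 before adjustment, but only one remains significant after Bonferroni and two after Holm. Holm is never more conservative than Bonferroni while still controlling the probability of at least one false positive.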

Consistency: Does the secondary endpoint result make sense in the context of the primary (and other secondary) outcome findings?

Outcomes with similar pathophysiology (e.g. myocardial infarction and ischemic stroke with antihypertensive agents) should move in the same direction (both increased or both decreased), whereas outcomes with opposing pathophysiology (e.g. myocardial infarction and bleeding with antiplatelets) should move in opposite directions.

E.g. In FIELD (Keech A et al.), the secondary outcome of non-fatal myocardial infarction was statistically significantly less frequent in the fenofibrate group than in the placebo group. However, all-cause mortality, coronary death, deep vein thrombosis, and pulmonary embolism occurred more frequently in the fenofibrate group.

Was there an unexpected positive finding for a rare outcome?

One should be skeptical whenever an unexpected statistically significant reduction is found in a rare secondary outcome, particularly when there is no difference in the primary outcome.

E.g. The ELITE trial (Pitt B, Segal R, et al.) comparing losartan to captopril in 722 elderly heart failure patients failed to find a significant difference in the incidence of the primary outcome, an increase in serum creatinine (11% in both groups). There was, however, an unexpected reduction in all-cause mortality with losartan vs. captopril (5% vs 9%, p=0.04). The follow-up ELITE II trial (Pitt B, Poole-Wilson PA, et al.), with a larger sample of 3152 patients and a primary outcome of mortality, found no reduction in mortality with losartan vs. captopril; mortality was in fact numerically higher with losartan (18% vs 16%, p=0.16).

License

NERDCAT Copyright © 2022 by Ricky Turgeon is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.