Effect analysis and related approaches
8.3 Defining effects and causal relationship
Donald Campbell (1916-1996) was a social scientist with expertise in psychology. He made landmark contributions to the study of causality in evaluation. He and his colleagues, William Shadish and Thomas Cook, published renowned works on causality and experimentation which are still foundational references in the field (Shadish et al., 2002). In their 2002 book, they stated:
We can better understand what an effect is through a counterfactual model that goes back at least to the 18th-century philosopher David Hume (Lewis, 1973, p. 556). A counterfactual is something that is contrary to fact. In an experiment, we observe what did happen when people received a treatment. The counterfactual is knowledge of what would have happened to those same people if they simultaneously had not received the treatment. An effect is the difference between what did happen and what would have happened. (Shadish et al., 2002, p. 5)
A counterfactual cannot be observed directly (it is logically impossible to be in the treatment and non-treatment groups simultaneously), hence the need to create a reasonable approximation of the counterfactual (Shadish et al., 2002). To analyze effects, evaluations are meant to create experimental conditions that allow effects to be identified and ensure that the effects observed are attributable to the intervention.
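The counterfactual logic above can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions: the function name, the outcome distribution, and the constant effect of 2.0 are all hypothetical and do not come from Shadish et al. (2002). It gives every simulated person two potential outcomes, then shows that a randomly assigned control group provides a reasonable approximation of the unobservable counterfactual.

```python
import random
import statistics


def simulate_experiment(n=10_000, true_effect=2.0, seed=0):
    """Return (true average effect, estimate from a randomized experiment).

    All values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Two potential outcomes per person: without and with the treatment.
    without_treatment = [rng.gauss(10.0, 3.0) for _ in range(n)]
    with_treatment = [y + true_effect for y in without_treatment]

    # The true effect compares both outcomes for the same people,
    # which no real study can observe.
    true_average_effect = statistics.mean(
        w - wo for w, wo in zip(with_treatment, without_treatment)
    )

    # Random assignment: each person reveals only one of the two outcomes.
    assigned = [rng.random() < 0.5 for _ in range(n)]
    treated = [w for w, a in zip(with_treatment, assigned) if a]
    control = [wo for wo, a in zip(without_treatment, assigned) if not a]

    # The control group stands in for the missing counterfactual.
    estimate = statistics.mean(treated) - statistics.mean(control)
    return true_average_effect, estimate


if __name__ == "__main__":
    truth, estimate = simulate_experiment()
    print(f"true average effect:   {truth:.2f}")
    print(f"experimental estimate: {estimate:.2f}")
```

In this sketch the estimate comes close to the true average effect only because assignment is random. If assignment depended on the outcome without treatment, the control group would no longer be a fair stand-in for the counterfactual.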
How do we know if cause and effects are related? In a classic analysis formalized by the 19th-century philosopher John Stuart Mill, causal relationship exists if (1) the cause preceded the effect, (2) the cause was related to the effect, (3) we can find no plausible alternative explanation for the effect other than the cause. These three characteristics mirror what happens in experiments in which (1) we manipulate the presumed cause and observe an outcome afterwards; (2) we see whether variation in the cause is related to variation in the effect; (3) we use various methods during the experiment to reduce the plausibility of other explanations for the effect, along with ancillary methods to explore the plausibility of those we cannot rule out. […] Experiments explore the effects of things that can be manipulated, such as a dose of a medicine, the amount of a welfare check, the kind or amount of psychotherapy, or the number of children in a classroom. (Shadish et al., 2002, pp. 6-7)
However, nonmanipulable events or attributes cannot be studied experimentally. Assessing their causal relationships is harder and requires a research design other than an experiment. A variety of research designs exist as alternatives to experimental approaches.
It is important to note that experimental research covers a wide range of designs and diverse contexts. Various terms are used to represent these distinct evaluative contexts:
- Efficacy refers to the measurement of effects under controlled conditions (as opposed to real-life conditions). In these controlled settings, all groups receive the same predetermined doses of the intervention.
- Effectiveness, on the other hand, measures the intervention’s impact under normal usage conditions, accounting for potential challenges such as differences in access or dosage due to typical usage (Champagne et al., 2011a).
Experimental research is often presented as the gold standard for studying attribution of effects. However, this type of research also has limits. In particular, the conditions of experimentation may reduce the ability to generalize results (Shadish et al., 2002).
For example, efficacy, which is the difference in effects between the treatment group and the control group, generally differs from (and is greater than) effectiveness, which is the difference in effects between users and non-users measured under normal conditions of use (Champagne et al., 2011a). This difference is primarily explained by issues of accessibility and variations in treatment adherence. Similarly, some have criticized experimental research for its frequent focus on white male participants, which can limit consideration of the characteristics of other population groups and hide potential effects on groups not included in the experiment.
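The gap between efficacy and effectiveness can be made concrete with a small arithmetic sketch in Python. It rests on strong simplifying assumptions that are not taken from Champagne et al. (2011a): the effect is assumed to be proportional to the share of the intended dose actually received, and the function names and numbers are hypothetical.

```python
def efficacy(full_dose_effect):
    """Effect under controlled conditions.

    Every participant receives the full predetermined dose.
    """
    return full_dose_effect


def effectiveness(full_dose_effect, dose_fractions):
    """Average effect among users under normal conditions of use.

    dose_fractions holds, for each user, the share of the intended dose
    actually received (0.0 to 1.0), reduced by limited access or partial
    adherence. Assumes the effect is proportional to the dose received.
    """
    average_fraction = sum(dose_fractions) / len(dose_fractions)
    return full_dose_effect * average_fraction


if __name__ == "__main__":
    # Hypothetical users: one fully adherent, two partly adherent,
    # and one who could not access the intervention at all.
    fractions = [1.0, 0.5, 0.5, 0.0]
    print(efficacy(10.0))
    print(effectiveness(10.0, fractions))
```

Under these assumptions, effectiveness equals efficacy only when every user receives the full dose. Any shortfall in access or adherence lowers it.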
“The strength of experimentation is its ability to illuminate causal inference” (Shadish et al., 2002, p. 18). In particular, experimental designs eliminate threats to internal validity such as rival hypotheses (McDavid & Huse, 2019). “The weakness of experimentation is doubt about the extent to which that causal relationship generalizes” (Shadish et al., 2002, p. 18).