RCTs are considered the best way to determine whether a treatment is effective, and the reason is simple: randomization balances all other variables across the groups, so researchers can isolate the effect of changing one variable. As such, RCTs are powerful and extremely important. However, they are not the only way to contribute data to a scientific body of evidence. As with any study design, RCTs have limitations, and they should not be the sole basis for scientific evidence and decisions.
Feasibility
RCTs are not always feasible. There are many real-world factors that prevent RCTs from being used to answer every question about effectiveness.
- Prolonged Outcomes: If the question is “Can X cause cancer?”, an RCT would need to follow participants for years, possibly decades, because that is how long cancer can take to develop.
- Ethics: Sticking with the cancer example, if you suspect that an exposure might lead to cancer, it would be unethical to randomize people into groups where some of them get the exposure while others don't. There are other study designs better suited to address this conundrum, such as natural experiments and case-control studies.
- Cost: RCTs are expensive to conduct.
- Not Realistic: Even when RCTs can be designed, they may not reflect how treatments are used in the real world. For example, people outside a trial may not take their medications as consistently as participants do; doses may be missed or taken differently than instructed. Additionally, RCTs are often small and enroll participants who do not reflect the broader population.
Interpretation Drawbacks
Not every RCT is well designed. As with all scientific endeavors, the devil is in the details. Looking at the methods of an RCT may seem technical and overwhelming; however, understanding what was studied and how it was studied is incredibly important for drawing conclusions.
This problem was demonstrated in Parachute Use To Prevent Death And Major Trauma When Jumping From An Aircraft: Randomized Controlled Trial. Participants were randomly assigned to jump from a plane wearing either a parachute or a regular backpack. The study concluded that:
“Parachute use did not reduce death or major traumatic injury when jumping from aircraft in the first randomized evaluation of this intervention.”
Why? The participants jumped from a stationary aircraft parked on the ground, a drop of just a couple of feet.
The researchers didn't hide this. It was in their methods section and was, in fact, the point of the study. They measured meaningful outcomes (death and major traumatic injury) and used the gold-standard RCT design to investigate their question. However, the question is not really relevant to parachute use, since you would never need a parachute to jump down a couple of feet. While critical appraisal of study methodology is technical, it is essential for understanding what can be concluded from any study, even an RCT.
Insisting on studying everything under the sun with an RCT is not feasible. Those calling for it are willfully ignoring the drawbacks of RCTs and pretending that if we could just run RCTs to answer every question, then we would know all the answers. The reality is that even if we could run an RCT for every single question, we still wouldn't have all the answers, because RCTs carry their own inherent limitations.
“The PARACHUTE trial does suggest, however, that their accurate interpretation requires more than a cursory reading of the abstract. Rather, interpretation requires a complete and critical appraisal of the study.”
--Parachute Use To Prevent Death And Major Trauma When Jumping From An Aircraft: Randomized Controlled Trial
Other Relevant Questions And Study Designs
Scientific evidence and consensus are built on a plethora of data and study designs, not just RCTs. While RCTs are the gold standard for efficacy, efficacy isn't the only relevant question when developing a treatment or intervention. RCTs answer the question, “Does this work in an ideal setting?” However, “Does it work in the real world?” (effectiveness) and “Does it cause problems?” (safety) are also extremely important to study.
Designs such as natural experiments, case-control studies, and post-market monitoring are better suited to specific questions about long-term safety and outcomes.
- Natural experiments can answer both “Does it work in the real world?” and “Does it cause problems?”
- Case-control studies can help answer “Does this seem to be safe over time?”
- Post-market monitoring can help answer whether a treatment or intervention is effective in the real world and whether it is safe.
While observational study designs have limitations, they provide important information that cannot be gleaned from RCTs. Case-control studies and retrospective cohort studies were purposely designed to address some of the limitations of RCTs. For example, a case-control study was responsible for revealing that smoking can cause cancer. Studies with these designs can provide a different picture of the broader context in which people, treatments, and interventions operate.
RCTs and Vaccines
Vaccines provide a perfect case study for how RCTs pair with other types of studies to inform a larger body of evidence. In the United States, vaccines undergo multiple rounds of RCTs and are then subject to post-market monitoring. VAERS is the best-known of these post-market monitoring efforts.
“Vaccines, like other pharmaceutical products, undergo extensive testing and review for safety, immunogenicity, and efficacy in trials with animals and humans before they are licensed in the United States. Because these trials generally include a placebo control or comparison group, it is possible to ascertain which local or systemic reactions were due to the vaccine…The continuous monitoring of vaccine safety in the general population after licensure (known as post-licensure or post-marketing surveillance) is used to identify and evaluate risk for such AEs after vaccination.”
--CDC, Manual for the Surveillance of Vaccine-Preventable Diseases
We know from RCTs that vaccines help prevent deaths and may prevent infection altogether. From observational data, we know that vaccines remain safe over time and roughly how long their protection lasts. While RCTs play a vital part in this process, they cannot do it alone. Health sciences research consists of multiple types of study designs that each play a specific part in helping to determine what is helpful, effective, and safe. A comprehensive scientific evidence base requires a diverse portfolio of studies, with each study design being the "gold standard" for the questions it is best suited to answer. Ignoring other types of evidence simply because they aren't RCTs is a disservice to research and public health overall.
This is the pattern followed for other treatments and interventions, from HRT to statins and beyond. Evidence is layered to create a more complete picture. RCTs are used when a treatment or intervention is first introduced, assessing short-term safety and efficacy in a controlled environment. Observational data are then collected to assess real-world, long-term effectiveness and safety at scale.
Ultimately, insisting that every question be answered by an RCT misunderstands both the strengths and the limits of the method. No single study design can carry the entire weight of public health and biomedical research. The real world is too complex, too varied, and too messy to be captured by a single method. By valuing diverse forms of evidence, each suited to different questions, we weave together a richer, more accurate picture of how interventions perform where it matters most: in everyday life. That is the foundation of responsible, evidence-based practice.
