A Simple Solution to Separate Statistically Significant from Clinically Significant


How can physicians translate research findings into useful information for the care of their patients? P-values suggest differences, not effects. But could there be a simple solution?

One of the persistent problems for physicians is translating research and guidelines into the care of their patients. The problem is compounded because much of the research is couched in terms of statistical significance, not "clinically meaningful differences." A treatment that reduces blood pressure by 2-3 mmHg may be statistically significant compared to its competition but makes little, if any, difference to the patient.
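
A back-of-the-envelope simulation makes the point; the numbers below are hypothetical, chosen only to show how a clinically trivial difference becomes "statistically significant" once a trial is large enough.

```python
# A rough sketch with made-up numbers: a trivial 2 mmHg drop in systolic
# blood pressure produces an impressively small p-value in a large trial.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 5000                                   # patients per arm (hypothetical)
control = rng.normal(140, 15, size=n)      # mean 140 mmHg, SD 15 mmHg
treated = rng.normal(138, 15, size=n)      # a mere 2 mmHg "improvement"

t, p = stats.ttest_ind(treated, control)
print(f"difference: {treated.mean() - control.mean():.1f} mmHg, p = {p:.1e}")
# With 5,000 patients per arm, p lands far below 0.05 despite the tiny effect.
```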

Physicians have attempted to get around this problem with the "effect size," a measure of how much the treatment and control groups overlap. A more accessible idea is the number needed to treat, the NNT: how many patients must be treated for one to benefit. A low NNT, say for a statin reducing a patient's LDL, is good; a high NNT, say for aspirin preventing a heart attack, means the treatment is effective for just a few of the many treated. And what counts as clinically meaningful also varies through the lens of the stakeholders.
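
The arithmetic behind the NNT is simple; here is a minimal sketch with invented event rates.

```python
# A minimal sketch, hypothetical event rates: the NNT is the reciprocal of
# the absolute risk reduction (ARR).
def nnt(control_event_rate: float, treated_event_rate: float) -> float:
    arr = control_event_rate - treated_event_rate   # absolute risk reduction
    return 1.0 / arr

# Suppose 4% of untreated patients have a heart attack, versus 3% of treated:
print(nnt(0.04, 0.03))   # 100.0 -> treat 100 patients to prevent one event
```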

"healthcare consumers typically seek, short-term, identifiable benefits, such as symptomatic relief, while clinicians elevate clinical outcomes… often corresponding poorly to symptomatic relief."

Patients want to feel better now and later, although later is never as important as now. For clinicians, who want useful data today, chronic disease is often characterized by intermediary risk factors that stand in for the illness itself: for example, the A1c level in glucose management, or LDL in the management of cholesterol and its attendant impact on cardiovascular disease. Payers want "more bang for the buck," the most efficacious treatment at the lowest cost; and payers are not just insurance companies, they include government. For policymakers, there are "bridge" studies correlating a statistically significant clinical effect with direct and indirect health costs. But once again, what is statistically significant is not always clinically relevant. And all of this breeds a range of measurements, including cost per life-year and cost per quality-adjusted life-year (QALY), terms for experts in the field, not concepts easily accessed by patients and physicians.
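
For the curious, the calculation beneath "cost per QALY" is just a ratio; the figures in this sketch are invented for illustration.

```python
# A minimal sketch, invented figures: the incremental cost-effectiveness
# ratio (ICER) underlying "cost per quality-adjusted life-year."
def cost_per_qaly(new_cost: float, old_cost: float,
                  new_qalys: float, old_qalys: float) -> float:
    return (new_cost - old_cost) / (new_qalys - old_qalys)

# A new drug that costs $12,000 more and buys 0.3 additional QALYs:
print(cost_per_qaly(20_000, 8_000, 2.1, 1.8))   # 40000.0 -> $40,000 per QALY
```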

"The FDA looks for 'substantial evidence' that a drug will do what it is labeled to do, although it does not define substantial evidence."

New drugs needn't be better; they need only be as effective, resulting in trials designed to demonstrate non-inferiority. That sets the goal rather low, especially in contrast to a study designed to show superiority. All of this, of course, still leaves a question unanswered, because inferiority and superiority are aggregate measures of the balance of a treatment's benefits and harms. The choice is easy when the benefits are identical: choose less harm. But how do we proceed when the results are a mixture of good and bad, as real-world findings usually are?
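
For the statistically inclined, the non-inferiority verdict typically comes down to a confidence interval and a pre-specified margin; a sketch, assuming an outcome where higher is better:

```python
# A minimal sketch: the new drug is declared non-inferior if the whole
# confidence interval for (new minus standard) sits above the negative of
# a pre-specified margin.
def non_inferior(ci_lower: float, margin: float) -> bool:
    """ci_lower: lower bound of the CI for the new-minus-standard difference."""
    return ci_lower > -margin

print(non_inferior(ci_lower=-1.0, margin=2.0))  # True: at worst slightly worse
print(non_inferior(ci_lower=-3.0, margin=2.0))  # False: could be meaningfully worse
```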

Finally, there is the challenge of mapping the study's population onto the patient across the table from you. Here we have seen improvement; many guidelines are now written to stratify patients by their five- or ten-year risks.

Health scientists have made progress against p-hacking and other manipulations that can promote treatments with little benefit but statistical significance. The same manipulations can, to a lesser degree, hide genuinely beneficial treatments when a study is underpowered to detect the difference. To reduce the temptation to hunt for a positive result, or to misuse statistical analysis, most studies now state their outcome measures when the research is registered, before it begins. That registration also offers researchers the opportunity to indicate the clinically meaningful difference they hope to find.
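
Underpowering itself is simple arithmetic; a sketch using statsmodels, with a hypothetical effect size, shows how easily a modest but real benefit slips below the threshold of "statistical significance."

```python
# A minimal sketch, hypothetical effect size: a trial with 50 patients per
# arm has little chance of detecting a modest effect (Cohen's d = 0.2).
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()

# Chance such a trial detects the effect at alpha = 0.05:
print(power.solve_power(effect_size=0.2, nobs1=50, alpha=0.05))    # ~0.17

# Patients per arm actually needed for the conventional 80% power:
print(power.solve_power(effect_size=0.2, power=0.8, alpha=0.05))   # ~394
```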

With a predetermined meaningful difference in mind, the findings are relatively straightforward: the new treatment is clinically meaningful or it is not. The controversy may well move from the results to whether the chosen measure is worthwhile. But I suspect it will be far easier for a physician to communicate, and a patient to understand, that metric than to continue trying to translate p-values, confidence intervals, and quality-adjusted life-years.
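
In code, the decision rule would be almost trivial; a sketch, assuming the trialists pre-registered a minimal clinically important difference (MCID):

```python
# A minimal sketch, assuming a pre-registered MCID: the verdict is simply
# whether the observed effect (here, the lower bound of its confidence
# interval) clears the pre-registered bar.
def clinically_meaningful(effect_ci_lower: float, mcid: float) -> bool:
    return effect_ci_lower >= mcid

# Hypothetical: a drug lowers systolic BP by 6 mmHg (95% CI lower bound
# 4.5 mmHg) against a pre-registered MCID of 5 mmHg.
print(clinically_meaningful(4.5, 5.0))  # False: significant, but not meaningful
```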

 

Sources: Importance of Defining and Interpreting a Clinically Meaningful Difference in Clinical Research, JAMA Otolaryngology-Head and Neck Surgery, DOI: 10.1001/jamaoto.2019.3744; Improving the Quality of Reporting of Research Results, JAMA Otolaryngology-Head and Neck Surgery, DOI: 10.1001/jamaoto.2016.2670; Defining a Clinically Meaningful Effect for the Design and Interpretation of Randomized Controlled Trials, Innovations in Clinical Neuroscience, 2013 May;10(5-6 Suppl A):4S-19S, PMID: 23882433, PMCID: PMC3719483.