You wouldn't normally think that circumcision, a 4000-year-old practice, would be a hot news topic. You'd be wrong.
In early 2026, the UK Crown Prosecution Service drafted guidance classifying some non-therapeutic circumcisions as potential child abuse or harmful practice, triggering outrage among Jewish and Muslim groups. More recently, Belgian prosecutors indicted three mohels in connection with a long-running investigation into allegedly illegal circumcisions in Antwerp.
The renewed attention to circumcision comes only months after Robert F. Kennedy Jr. reignited a very different controversy by claiming that circumcised boys have “double the rate of autism.”
In support of the claim, Kennedy pointed to two observational studies that have circulated for years in anti-establishment medical circles: a 2015 Danish paper by Frisch and Simonsen and a 2013 population-based study by Bauer and Kriebel. Together, the papers helped fuel a narrative that statistical associations surrounding circumcision may point to a previously overlooked autism risk. It's nonsense. Here's why. [1]
Numbers can seem scary
The evidence behind Kennedy's claims rests largely on statistical measures called hazard ratios (HRs). In simple terms, an HR compares how often a particular outcome occurs in one group versus another over time. In the Danish study, one analysis reported an HR of roughly 1.46, meaning a 46% higher rate of autism spectrum disorder in the circumcised group, while another subgroup analysis reported an HR above 2, or roughly double the rate, for what the authors called “infantile autism.”
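For readers who want to see the arithmetic, here is a minimal sketch of how a ratio like 1.46 turns into "46% higher." It uses a crude rate ratio (events per person-year in one group divided by the same in another), which is only a stand-in: a real hazard ratio comes from a time-to-event model that adjusts for other variables. The function names and the counts are hypothetical, invented for illustration, and are not taken from either study.

```python
def crude_rate_ratio(events_a, person_years_a, events_b, person_years_b):
    """Ratio of event rates (events per person-year) in group A vs. group B."""
    rate_a = events_a / person_years_a
    rate_b = events_b / person_years_b
    return rate_a / rate_b


def percent_difference(ratio):
    """Express a ratio as a percent increase (or decrease) over the comparison group."""
    return (ratio - 1.0) * 100.0


# Hypothetical counts, invented purely for illustration (not from either study).
ratio = crude_rate_ratio(events_a=73, person_years_a=10_000,
                         events_b=50, person_years_b=10_000)

print(f"rate ratio: {ratio:.2f}")
print(f"percent difference: {percent_difference(ratio):.0f}%")
print(f"a ratio of 2.0 means {percent_difference(2.0):.0f}% higher, i.e. double")
```

Note that the ratio by itself says nothing about how common the outcome is in either group, or about whether the difference was caused by the exposure.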
To those unfamiliar with statistics, these numbers may seem scary. But what is really scary is that they're even being used in the discussion of public health policy.
More importantly, these numbers did not come from randomized controlled trials (RCTs), the gold standard of medical evidence. Properly designed RCTs test causation by defining a clinical goal (endpoint) in advance and comparing outcomes between carefully balanced control and treatment groups.
The two studies Kennedy cited are both retrospective observational studies, which look backward through existing data searching for statistical associations. Such studies are useful for generating hypotheses, but they are far weaker than RCTs for establishing cause and effect because they are much more vulnerable to hidden factors (confounders) and to researchers repeatedly searching huge databases until a statistical pattern finally appears. More on that below.
But that doesn't mean they are
That does not mean observational studies are automatically useless. Such studies can sometimes generate valuable hypotheses that deserve further investigation. Observational studies have sometimes identified genuine dangers, including the links between smoking and lung cancer, asbestos and mesothelioma, and thalidomide and birth defects.
But observational studies are only a starting point. Hazard ratios are not scientific truth machines, and scary-looking statistical relationships do not automatically prove cause and effect. In large observational studies, hidden differences between populations (confounders), repeated searching through massive datasets, and plain old statistical chance can all generate numbers that look convincing but may ultimately be meaningless.
That is where the circumcision-autism studies begin to fall apart.
Correlation Is Not Causation
This is where the difference between observational studies and randomized controlled trials really stands out.
In a properly designed randomized controlled trial (RCT), researchers decide in advance exactly what they are looking for. The clinical endpoints are defined before the study begins, the treatment and control groups are deliberately balanced through randomization, and the statistical analysis is planned ahead of time. The entire structure is designed to answer a specific question: Did the intervention actually cause the observed outcome?
Retrospective observational studies work very differently. Researchers begin with enormous collections of existing medical records and population data, then search for statistical relationships between events that appear to move together. The problem is that, given enough searching, large datasets will inevitably produce impressive-looking correlations that are completely meaningless.
Researchers have humorously demonstrated this with absurd examples showing extraordinarily strong statistical relationships between unrelated events, such as shoe size and cancer risk, ice cream consumption and arthritis, or birth month and heart disease.
There are thousands of similar examples of bizarre but impressive-looking statistics creating the illusion of a meaningful relationship. Some are crazier than others. One famous example shows an almost perfect correlation between the number of babies given the name Eleanor over a 25-year period and the amount of electricity generated by Polish wind farms.
Graph adapted from Tyler Vigen, Spurious Correlations, https://www.tylervigen.com/spurious-correlations
The statistics appear overwhelmingly convincing. The letter “r” (called the Pearson correlation coefficient) measures how closely two variables move together, and here the value is an astonishing 0.993, almost a perfect correlation. The term “r²” (the coefficient of determination) is derived from r and measures how much of the variation in one variable can statistically be accounted for by the other; in other words, it estimates how well knowing one trend allows you to estimate the other. Finally, the “p-value” measures the probability of seeing a relationship at least this strong if the two variables were truly unrelated, and here p < 0.01 suggests the odds are very small.
Yet despite these remarkably powerful statistical signals, there is obviously no plausible connection between babies named Eleanor and wind power generation in Poland. The graph comes from Tyler Vigen’s famous "Spurious Correlations" project, designed specifically to demonstrate how meaningless variables can generate spectacular statistical relationships, especially when a huge number of events from multiple databases are involved.
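It takes very little to produce numbers like these. The sketch below computes r and r² for two made-up series that have nothing in common except that both happen to rise over ten years. The data are hypothetical placeholders, not Vigen's actual figures, and `pearson_r` is simply the textbook formula written out by hand.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical, invented series: both simply trend upward over ten years.
baby_names = [100, 112, 119, 133, 141, 148, 163, 170, 178, 192]
wind_output = [5.0, 5.9, 6.1, 7.2, 7.4, 8.3, 8.8, 9.6, 9.9, 11.0]

r = pearson_r(baby_names, wind_output)
r_squared = r ** 2

print(f"r = {r:.3f}, r squared = {r_squared:.3f}")

# For the r quoted in the text: 0.993 squared is about 0.986.
print(f"0.993 squared = {0.993 ** 2:.3f}")
```

Any two quantities that drift in the same direction over the same period will score well on this test, whether or not they have anything to do with each other.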
As Vigen explains:
“I have 25,153 variables in my database. If I compare each variable against every other one, I end up performing more than 632 million correlation calculations. This is called ‘data dredging.’ Instead of starting with a hypothesis and testing it, I abuse the data to see what correlations shake out. The danger is that any sufficiently large dataset will inevitably yield strong statistical relationships purely by chance.”
Tyler Vigen, Spurious Correlations, https://www.tylervigen.com/spurious-correlations
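Vigen's point can be reproduced on a small scale. The following is a rough sketch, not his method: it builds 200 series of pure random noise, compares every pair, and counts how many clear a threshold that would ordinarily be called significant at p < 0.01. The series count, series length, and seed are arbitrary choices made for illustration; the 0.765 cutoff is the approximate critical value of |r| for ten data points.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


random.seed(0)  # fixed seed so the run is repeatable

NUM_SERIES = 200     # hypothetical number of unrelated "variables"
SERIES_LENGTH = 10   # e.g. ten yearly observations each
THRESHOLD = 0.765    # approximate |r| needed for p < 0.01 with 10 data points

# Every series is pure random noise, so no pair is genuinely related.
series = [[random.gauss(0, 1) for _ in range(SERIES_LENGTH)]
          for _ in range(NUM_SERIES)]

total_pairs = 0
strong_pairs = 0
for i in range(NUM_SERIES):
    for j in range(i + 1, NUM_SERIES):
        total_pairs += 1
        if abs(pearson_r(series[i], series[j])) > THRESHOLD:
            strong_pairs += 1

print(f"{strong_pairs} of {total_pairs} noise-vs-noise pairs have |r| > {THRESHOLD}")

# Vigen's arithmetic: each of 25,153 variables compared against every other one.
vigen_comparisons = 25_153 * 25_152
print(f"{vigen_comparisons:,} comparisons")
```

With 19,900 pairs and a 1% threshold, on the order of a couple of hundred "significant" correlations should turn up even though every one of them is noise. Scale that up to hundreds of millions of comparisons and spectacular coincidences are guaranteed.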
That is precisely the danger facing retrospective observational research.
The HRs Are Not Especially Strong. Neither Are the Claims.
The headline HR in the Danish circumcision paper was roughly 1.46 for autism spectrum disorder. That may sound impressive, but truly powerful cause-and-effect relationships, like smoking and lung cancer (HR ~ 20), produce much larger numbers.
More importantly, modest HRs from observational studies are especially vulnerable to hidden differences between groups (confounders), diagnostic bias, incomplete data, and plain old statistical chance.
That matters here because ritual circumcision in Denmark is strongly associated with specific religious and immigrant populations that differ from the general population in many ways unrelated to circumcision itself. Differences in healthcare access, socioeconomic status, cultural attitudes toward diagnosis, and developmental screening can all influence autism statistics independently of circumcision.
The strongest HR reported in the study also came from a smaller subgroup analysis involving “infantile autism” in children under five. This is a big red flag. The more researchers subdivide data into smaller and smaller subgroups, the greater the likelihood that one subgroup will eventually produce an alarming-looking statistical result purely by chance. For example, had the researchers been looking for “infantile autism” in children under three, it is entirely possible that the apparent association might never have appeared at all.
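The subgroup problem can be put in numbers. As a simplified sketch, assume each subgroup test is independent and uses the usual 5% significance cutoff; the chance that at least one test raises a false alarm is then 1 minus 0.95 raised to the number of tests. Both assumptions are mine, for illustration: real subgroup analyses overlap, and the number of subgroups the Danish researchers examined is not something this calculation knows.

```python
def chance_of_false_alarm(num_tests, alpha=0.05):
    """Probability that at least one of num_tests independent tests
    comes up 'significant' by chance alone, at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (1, 5, 10, 20):
    probability = chance_of_false_alarm(k)
    print(f"{k:>2} subgroup tests -> {probability:.0%} chance of at least one false alarm")
```

Under these assumptions, ten subgroup tests carry roughly a 40% chance of at least one spurious "significant" finding, and twenty tests push that to about 64%.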
The Danish study faced another serious problem: incomplete data. Many ritual circumcisions in Denmark occur outside the hospital registry system used by the researchers, a limitation the authors themselves acknowledged. If researchers cannot reliably determine who actually belongs in the circumcised group, the resulting hazard ratios become far less trustworthy and may ultimately be meaningless.
Defenders of these kinds of studies often point out that the results achieved “statistical significance.” [2] But in very large databases, statistical significance can be surprisingly easy to obtain. With enough comparisons, subgroup analyses, and searches through massive datasets, researchers will inevitably uncover impressive-looking statistical relationships that are ultimately meaningless, much like Eleanor and Poland.
That does not make observational studies useless. As noted earlier, they can generate valuable hypotheses and have sometimes identified genuine dangers. But observational studies alone are a very weak foundation for sweeping public-health claims.
The studies RFK Jr. cites are demonstrably weak. The link between circumcision and autism is patently ridiculous. Just ask Eleanor. Last I heard, she was in Poland.
NOTES
[1] Neither of the two studies Kennedy cited actually measured Tylenol (acetaminophen) exposure. The connection between circumcision, Tylenol, and autism is speculative and not directly tested in either paper.
[2] Statistical significance does not mean that a result is scientifically important or that cause and effect have been proven. It usually means only that, under a particular statistical model, a relationship at least as strong as the one observed would be expected to arise by chance less than 5% of the time if no real relationship existed (p < 0.05).
Josh Bloom
Director of Chemical and Pharmaceutical Science
Dr. Josh Bloom, the Director of Chemical and Pharmaceutical Science, comes from the world of drug discovery, where he did research for more than 20 years. He holds a Ph.D. in chemistry.
