An Unsettling Hint at How Much Fraud Could Exist in Science

Two experts on dishonesty are separately accused of tampering with data for the same research paper. Has this ever happened before?

Illustration of a scientist painting a bar graph onto a wall (Ben Kothe / The Atlantic; source: Getty)

Updated at 5:24 p.m. ET on August 2, 2023

Two years ago, an influential 2012 study of dishonesty co-authored by the social psychologist and best-selling author Dan Ariely came under scrutiny. A group of scientists argued on their blog that some of the underlying data—describing the numbers of miles that a car-insurance company’s customers reported having driven—had been faked, “beyond any shadow of a doubt.” The academic paper featuring that study, which described three separate experiments and had five co-authors in all, was retracted not long after. At the time, Ariely said that the figures in question had been shared with him by the insurance company, and that he had no idea they might be wrong: “I can see why it is tempting to think that I had something to do with creating the data in a fraudulent way,” he told BuzzFeed, “but I didn’t.”

Had the doctoring been done by someone from the insurer, as Ariely implied? There didn’t seem to be a way to dispute that contention, and the company itself wasn’t saying much. Then, last week, NPR’s Planet Money delivered a scoop: The company, called The Hartford, informed the show that it had finally tracked down the raw numbers that were provided to Ariely—and that the data had been “manipulated inappropriately” in the published study. Reached by NPR, Ariely once again denied committing fraud. “Getting the data file was the extent of my involvement with the data,” he said.

That an expert on dishonesty would be accused of dishonesty was already notable. Paired with last month’s allegations that the Harvard Business School professor Francesca Gino—who also studies lying and is a frequent co-author of Ariely’s—is associated with falsified data for the very same 2012 paper, it’s downright bizarre. The analysis of insurance data from The Hartford appeared as “Experiment 3” in the paper. On the preceding page, an analysis of a different dataset—the one linked to Gino—was written up as “Experiment 1.” The scientists who say they discovered issues with both experiments—Leif Nelson, Uri Simonsohn, and Joe Simmons—dubbed the apparent double fraud a “clusterfake.” When I spoke with the scientific-misconduct investigator and Atlantic contributor James Heathers, he had his own way of describing it: “This is some kind of mad, fraudulent unicorn.”

Given reports (contested as they may be) that such an extraordinary beast exists, certain questions arise. For example, if the fraud is real, could this be a case of data-tampering in cahoots, or might it be nothing more than an odd and ironic coincidence? When I reached out to Ariely, he said he has never engaged in any research misconduct. “For more than 25 years and alongside dozens of esteemed colleagues and collaborators, I have conducted research that has resulted in more than 100 peer-reviewed papers,” he told me via email. “To be explicitly clear, I have never manipulated or misrepresented data in any of my work and have never knowingly participated in any project where the data or conclusions were manipulated or misrepresented.” Gino initially declined requests for comment. Since this article’s publication, her attorneys have filed a $25 million defamation lawsuit against Harvard University, Nelson, Simonsohn, and Simmons; and Gino posted a new statement on LinkedIn. “I want to be very clear: I have never, ever falsified data or engaged in research misconduct of any kind,” it says.

If the mad, fraudulent unicorn is real—if two different scientists really did fabricate data for separate experiments that were published in the same paper—the scenario might well be unprecedented. Neither Heathers nor any other experts I spoke with could recall a single example of this kind. (Ivan Oransky, the editor in chief of Spectrum and a co-founder of Retraction Watch, told me that he thinks it has happened in the past, but he couldn’t recall anything specific.) If the 2012 paper on dishonesty does represent a case of coordinated misconduct, that would certainly be unnerving. But there’s no evidence it does, and a coincidental, overlapping fraud would, in a way, be cause for even greater concern. It suggests that scientific fraud is much more common than the number of known cases might lead one to believe.

The actual rate of scientific fraud writ large is mysterious, but there are some clues. One laborious review of more than 20,000 biomedical research papers found that 3.8 percent contained images with “problematic” data, more than half of which showed signs of “deliberate manipulation.” And according to a meta-analysis of 18 anonymous survey studies conducted from 1985 to 2005, just under 2 percent of scientists admit to having fabricated, falsified, or modified data. That said, one can hardly expect every fraudster to self-identify as such, even anonymously. Why contribute to a result that could promote greater scrutiny of behavior like your own?

Further data on the problem are hard to come by, in large part because scientists rarely look for fraud in a systematic way, Heathers said. Nelson, one of the three psychologists who reported finding signs of tampering in the studies from the 2012 paper, told me that even delving into data from a single paper can be very time-intensive. His group, which investigates suspicious research for a blog called Data Colada, does this work not on behalf of any formal body, but rather as a sort of pro bono side hustle. (The Data Colada contributor Simonsohn co-authored a paper with Gino in 2013.)

The lack of interest from scientific institutions in identifying fraud has both led to and been reinforced by some starry-eyed assumptions, said Nick Brown, a psychologist whose own investigations of suspect research have led to numerous corrections and retractions. “There seems to be this idea that once you have a Ph.D., you are somehow a saint,” he told me. Then evidence of scientific misconduct emerges, and people act as though the unthinkable has happened.

A more skeptical posture has served Brown well in his own work as a data detective, as it has for the scientists behind Data Colada. When they set about reviewing Ariely’s work on the 2012 paper, a few quirks in the car-insurance data tipped them off that something might be amiss. Some entries were in one font, some in another. Some were rounded to the nearest 500 or 1,000; some were not. But the detail that really caught their attention was the distribution of recorded values. With such a dataset, you’d expect to see the numbers fall in a bell curve—most entries bunched up near the mean, and the rest dispersed along the tapering extremes. But the data that Ariely said he’d gotten from the insurance company did not form a bell curve; the distribution was completely flat. Clients were just as likely to have claimed that they’d driven 1,000 miles as 10,000 or 50,000 miles. It’s “hard to know what the distribution of miles driven should look like in those data,” the scientists wrote. “It is not hard, however, to know what it should not look like.”

One can apply a sort of mirror-image reasoning to the possibility of a double fraud in the dishonesty paper. The numbers of miles driven didn’t look the way the scientists assumed they should, so the scientists concluded that the data had been faked. Similarly, the number of fishy datasets in a single published paper doesn’t really fit with expectations. But in the latter case, it would be our assumptions that are off. If fraud is really very rare—if, say, less than 2 percent of scientists ever committed it even once in their careers—then an overlap in 2012 would be an implausible anomaly. But imagine that scientific misbehavior is a good deal more common than is generally acknowledged: If that’s the case, then “clusterfakes” might not be so unusual. Mad, fraudulent unicorns could be everywhere, just waiting to be found.
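A back-of-envelope calculation shows why the base rate matters so much. This is my own illustration, not a computation from the article, and it assumes (hypothetically) that each dataset in a two-dataset paper is faked independently with some probability p; under that assumption, the chance that both are faked is roughly p squared.

```python
# Illustrative arithmetic under an assumed independence model:
# the chance of a "clusterfake" grows quadratically with the
# per-dataset fraud rate p.
for p in (0.02, 0.05, 0.10):
    both_faked = p * p
    print(f"per-dataset fraud rate {p:.0%} -> chance both datasets are faked: {both_faked:.4f}")
```

At a 2 percent rate the coincidence is a 4-in-10,000 event; at 10 percent it becomes a 1-in-100 event, which is why a genuine coincidence would imply that fraud is far more common than the known cases suggest.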

Jacob Stern is a staff writer at The Atlantic.