As the failures of psychology as a scientific discipline continue to mount, I increasingly worry that the field isn’t salvageable. It’s not simply that there are so many problems, but that so few in the field seem to have the drive to do much about them.
There has been an enormous amount of bad research and an unknown amount of fraud. Why unknown? Because there are inadequate efforts to actually figure out how much fraud there is, much less to do anything about it. See these remarks from Dorothy Bishop. Here is one relevant excerpt:
In recent months, I've become convinced of two things: first, fraud is a far more serious problem than most scientists recognise, and second, we cannot continue to leave the task of tackling it to volunteer sleuths.
I agree. I’d add that while some courageous people have spotted a few instances of fraud, doing so required overcoming massive hurdles. People have to notice suspicious results, graphs, data, or claims; then they have to follow up; then they have to do the work to figure out whether fraud might be going on. This often means requesting raw data from people who routinely ignore such requests, or from whom complete information is difficult to extract.
All the while, one risks backlash and criticism. Universities drag their feet. The journals in which fraudulent papers are published drag their feet. Everyone acts like you’re a huge party-pooping walking quaalude who is just being annoying and ruining everyone’s fun. What a ridiculous reaction. One would think universities would handle potential fraud cases, if not enthusiastically, then at least quickly, efficiently, and openly; of all institutions, is it not their mandate to promote truth? And one would think journals wouldn’t want to retain fabricated data and bullshit claims in their pages.
So you not only need courage, but perseverance. Often a lot of it. Coupled with the tenacity not simply to wait out all the dragging feet, but to reopen doors when everyone wants to shut them in your face. On top of that, you need the perceptiveness to spot the errors in the first place, the competence to adequately identify instances of fraud, the caution to do so as rigorously and carefully as possible, and the social skills to navigate the fusillade of recriminations and objections you will encounter. You need the fortitude to know that exposing one researcher will inevitably harm the careers of innocent people in that person’s orbit as collateral damage, and the ability to accept that your actions will further tarnish the reputation of the very discipline you have spent so many years in, knowing that this could cause you to endure alienation, isolation, and suspicion yourself. I saw one comment suggesting that those who eventually take on the role of official data sleuths are especially likely to be guilty of fraud themselves. After all, where better to hide than in plain sight? The worst part is, I’m not sure that’s an unwarranted suspicion! It may be the inevitable cost of taking on such roles.
So it takes the convergence of a lot of qualities, and all for what? There is no pot of publications at the end of the rainbow for exposing one’s colleagues as frauds. So what, exactly, is the incentive? People shouldn’t have to risk martyring themselves to do the academic equivalent of reporting a crime. Indeed, it’s not clear why at least some of these cases shouldn’t be crimes. Researchers take taxpayer-funded grants that could have gone to serious research, and often enjoy taxpayer-funded salaries; they manipulate their data or fabricate it in its entirety; they flood journals with falsehoods and lies, potentially harming people when interventions are devised on the basis of these results; they waste the time and money of others who attempt to replicate or build on these findings; and then, when they generate lots of flashy and wild results, they exploit their lies and fabrications to advance their careers with book deals and speaking gigs. When they are exposed, it undermines the integrity of the entire field, discouraging investment from governments and private entities and threatening progress across entire fields of research. The consequences of fraud are diffuse, but massive. Why isn’t this a literal crime?
And that’s another thing. As Bishop observes,
To date, the response of the scientific establishment has been wholly inadequate.
She later adds that
The task of unmasking fraud is largely left to hobbyists and volunteers, a self-styled army of "data sleuths", who are mostly motivated by anger at seeing science corrupted and the bad guys getting away with it.
We can’t keep relying on “vigilantes,” but we’re left with few good choices. At present, with no official auditors of research, the task is left to volunteers working in their spare time. This is bad: there’s little incentive to do the work, and there are no clear and transparent structures, norms, or rules. It’s just the Wild West of data law. And people question the authority of “vigilantes”: who gave them the right to go around questioning others? This can leave people feeling targeted and singled out in unjust ways.
Yet if we were to attempt to establish an official body of data investigators, people may question their authority, and even their integrity. People may wonder at the personalities and motives of those who would voluntarily seek authority to investigate their colleagues. I’m not even sure they’re wrong to do so. Which means anyone taking on such a role may be taking on a burden of constant suspicion and doubt.
It’s a lose-lose situation. What I find strange about this is that precisely those people who are avid proponents of institutional review boards and of ethical oversight in a broad sense seem highly suspicious and critical of any institution that would directly police how the sausage gets made.
Another concern is that scientists seem to insist fraud is rare, but it’s not clear there is much evidence one way or another. I think the situation we’re in is one in which we simply don’t know. And any standard estimate of its prevalence, from self-report surveys or from counts of detected cases, is going to be noisy.
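To make the “noisy” point concrete, here is a minimal sketch with entirely made-up survey numbers; the 10-admissions-out-of-500 figure is a hypothetical illustration, not a real survey result:

```python
import math

def wilson_interval(k, n, z=1.96):
    """95% Wilson score interval for a proportion (k successes in n trials)."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# Hypothetical anonymous survey: 500 researchers asked, 10 admit to fabrication.
k, n = 10, 500
lo, hi = wilson_interval(k, n)
print(f"point estimate: {k/n:.1%}, 95% CI: {lo:.1%} to {hi:.1%}")
# -> point estimate: 2.0%, 95% CI: 1.1% to 3.6%
```

And that interval reflects sampling noise alone. People who fabricate data presumably also underreport it on surveys, so the true rate could sit outside the interval entirely.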
Much of what we catch is going to be the easiest and most blatant instances of fraud. How many people maliciously and intentionally p-hack their results? Even if they were caught p-hacking, they’d always have plausible deniability: they could claim negligence, incompetence, or motivated reasoning. Short of mind-reading devices, there’d be no way to prove such cases. This reveals a more general problem: there is no categorical distinction between outright fraud and sheer incompetence or negligence. Consider Gelman’s commentary on the matter (the comment section here is good, too):
And don’t forget Clarke’s Law: Any sufficiently crappy research is indistinguishable from fraud. All the above problems also arise with the sorts of useless noise mining we’ve been discussing in this space for nearly twenty years now. I assume most of those papers do not involve fraud, and even when there are clearly bad statistical practices such as rooting around for statistical significance, I expect that the perpetrators think of these research violations as merely serving the goal of larger truths.
Fraud and bad research blur together at the edges, creating a murky zone of useless and terrible data for which no malicious motive could ever be convincingly established.
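To see how little malice is required, here is a minimal simulation sketch of the crudest form of p-hacking: measure many unrelated outcomes where no true effect exists, then report whichever comparison comes out “significant.” The sample sizes and outcome counts are illustrative assumptions, not anyone’s actual study design:

```python
import math
import random

random.seed(1)

def p_value(a, b):
    """Two-sided p-value for a difference in means, using a normal
    approximation to the t distribution (fine at n = 50 per group)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def run_study(n_outcomes):
    """One null study: n_outcomes unrelated measures, no true effects.
    The p-hacker reports only the smallest p-value."""
    best = 1.0
    for _ in range(n_outcomes):
        treat = [random.gauss(0, 1) for _ in range(50)]
        control = [random.gauss(0, 1) for _ in range(50)]
        best = min(best, p_value(treat, control))
    return best

trials = 1000
for n_outcomes in (1, 5, 20):
    hits = sum(run_study(n_outcomes) < 0.05 for _ in range(trials))
    print(f"{n_outcomes:2d} outcomes -> 'significant' in {hits / trials:.0%} of null studies")
# With 20 outcomes to fish through, roughly two-thirds of pure-noise
# studies yield a publishable p < .05 somewhere.
```

Nothing in the resulting paper distinguishes the deliberate fish from the honest mistake, which is exactly the murky zone described above.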
This is a bad situation. Behavioral scientists who, like me, run YouTube channels are simply leaving or giving up. Consider this video from Pete Judo, “I’m done making Behavioral Science videos,” the title of which gives a pretty good idea of the content. Study after study fails to replicate, or fails to generalize in the important or interesting ways that made it worth reporting in the first place. Meanwhile, I encounter researchers claiming that the replication crisis is over, that the dark days of bad research are largely behind us. People seem to want to get on with business as usual, and if efforts to reform the field aren’t sustained this time around, the field will backslide right into the sloppy mess it has always been.
My general impression is that nobody really cares very much, and I sometimes feel hopeless myself. But then I remember how angry this makes me, and I fill my tank up with anger-fuel and get back to it. How ridiculous is this?
Richard Van Noorden reports:
An unpublished analysis shared with Nature suggests that over the past two decades, more than 400,000 research articles have been published that show strong textual similarities to known studies produced by paper mills. Around 70,000 of these were published last year alone (see ‘The paper-mill problem’). The analysis estimates that 1.5–2% of all scientific papers published in 2022 closely resemble paper-mill works. Among biology and medicine papers, the rate rises to 3%.
This is ridiculous. 3%?! And we’re still in the lag before tools like GPT, which make such practices far easier, have been fully taken up. How bad is it going to be a year or two from now (assuming AGI doesn’t kill us all)?
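For a sense of absolute scale, here is the back-of-the-envelope arithmetic implied by the quoted figures alone (no numbers beyond those in the excerpt):

```python
# ~70,000 suspected paper-mill articles in 2022, said to be 1.5-2%
# of everything published that year.
suspected_2022 = 70_000
for rate in (0.015, 0.02):
    total = suspected_2022 / rate
    print(f"at {rate:.1%}: ~{total / 1e6:.1f} million papers published in 2022")
# -> roughly 3.5 to 4.7 million papers in total, with the paper-mill
#    share concentrated in biology and medicine (~3% of those fields).
```

Tens of thousands of suspect papers a year is not a rounding error.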
Bishop suggests we train students how to spot data fraud, and provides a good list of the sorts of things such training would entail. I endorse this idea, but I worry that such proposals, even if we began immediately implementing them, might be too little, too late. If psychology is to emerge from its current, dismal state, it will be, at best, years from now.
One concerning factor is that irreproducible research is naturally selected for by journals that look for interesting results, which is to say pretty much all journals.
I think one factor driving this is the relentless pressure to publish in order to advance one’s career (and, to be honest, to secure grant money). If you start imposing stringent requirements of truthfulness, then nobody is going to get tenure.