You’ve made it to part five of Lecture 3, so you can probably already guess how indirect costs and neoliberal university rent-seeking shape research and its associated literature. But perhaps we can share a catharsis in one final reckoning with the context of discovery.
As Meehl puts it, “You don't know, when you look at what has surfaced, to what extent the experiments in a given domain came to be performed because of the financial and other pressures upon academicians.” Explicitly or implicitly, academic scientists have to raise funds to do their research, and such funds are scarce. Consequently, faculty work only on what they think can be funded.
As any young investigator knows, federal funding has become competitive. Proposals at the National Science Foundation or the National Institutes of Health are fiercely reviewed by other scientists and infrequently awarded. This review process means that a scientist might choose to play the odds, jumping on every academic trend and throwing their hat in every call for proposals they come across. It also means that scientists must spend their time marketing their ideas to convince their peers their work deserves one of the rare, prestigious grants.
Meehl argues that funding scarcity also shapes the sorts of projects that people propose, forcing faculty to favor expedience over curiosity. Scientists are compelled to choose the cheapest path to a result. This means that the particular method appearing in a paper is frequently the cheapest, not the most scientifically appropriate.
Well-funded projects come with their own special problems. If a sponsored project grows too large in cost, it becomes too big to fail. If you run some field study with hundreds of staff and hundreds of thousands of participants, the massive expenditure compels you to find some evidence that your intervention did what you said it would do. Does this mean that investigators write up the results of big projects in a way to save face? Does this mean there are incentives to continue to look for evidence of results that aren’t quite there? You’ll have to be the judge when you read such papers.
And what about projects that go against a party line? Are these less likely to be funded? Meehl recounts witnessing overly zealous scrutiny applied to edgier, out-of-favor proposals, and notes that scientists “feel they've got to research what the bureaucrat in Bethesda wants researched.” I know many still believe this to be true. Even such circumstantial evidence adds doubt to one’s assessment of the scientific literature.
Meehl doesn’t discuss this, but we obviously run into similar problems with non-governmental funding sources. Gifts from philanthropists are targeted at pet causes. Gifts from industry are dependent on industrial interests.
You might think that we should look outside the academy for less harried investigations, but industrial research, which has been growing steadily in computer science for the last decade, has its own biases. There is unquestionably a filter on the questions asked by researchers who work in industry. Industrial papers have to pass internal corporate review before being published. There have been notable blow-ups of people getting fired from industrial labs for not toeing party lines.
Now, patronage has always been part of science, but there is something particularly pernicious about our contemporary model built around constant, vicious competition. As I mentioned in passing, the constant competition with peers for scarce funds means scientists are constantly marketing, and this mindless scientific marketing may be the most damaging aspect of all of this.
Every proposal, paper, and presentation becomes a marketing promotion. The reader has to work through a startup pitch before getting to the main findings. If a clinician or practitioner knows that every publication is a sales document, their interpretation of every result becomes more critical and suspicious.
David Graeber points to this marketing, which has “come to engulf every aspect of university life,” as a primary source of stifled innovation. In his essay “Of Flying Cars and the Declining Rate of Profit,” collected in his 2015 book The Utopia of Rules, he asks why progress in science seems to have slowed since 1970. In academia, he calls out marketing as a central pernicious force:
“There was a time when academia was society's refuge for the eccentric, brilliant, and impractical. No longer. It is now the domain of professional self-marketers. As for the eccentric, brilliant, and impractical: it would seem society now has no place for them at all.”
Graeber concludes that when scientists spend their time marketing, competing with their peers, and choosing expedience over curiosity, we end up in a world of scientifically overproduced incrementalism.
“That pretty much answers the question of why we don’t have teleportation devices or antigravity shoes. Common sense dictates that if you want to maximize scientific creativity, you find some bright people, give them the resources they need to pursue whatever idea comes into their heads, and then leave them alone for a while. Most will probably turn up nothing, but one or two may well discover something completely unexpected. If you want to minimize the possibility of unexpected breakthroughs, tell those same people they will receive no resources at all unless they spend the bulk of their time competing against each other to convince you they already know what they are going to discover.
“That’s pretty much the system we have now.”
Welp. On that cheery note, we’d better get back to the tidy abstractions of philosophy next post…
The product of science is the result of a communal effort, so the incentives that affect individual scientists don't necessarily push the communal product in the same direction over time. In particular, even though it doesn't pay to go against the tide, some scientists do it anyway, and a few produce major new findings (sometimes rewarded with major prizes). Their impact on science may be larger than that of the many others who follow the fashion of the day.
Part of the issue is that there are a lot of scientists, probably more than the optimal number from the point of view of allocating resources to advance science. This is due to the proliferation of higher education and the connection between teaching and research. A lot of science has to be exploration of relatively low importance, but perhaps there's more of it than is necessary to facilitate great discoveries.
Three points:
1. ML research clearly has a very serious problem with treating research as marketing and idea promotion.
2. I agree that perverse incentives are almost certainly the underlying cause.
3. I would argue that there are other important incentives, in addition to funding, that influence the structural bias towards positive results.
To flesh these points out a bit more:
Prior work has used "percentage of papers reporting positive results" as a proxy for the amount of bias within a research field (see “Positive” Results Increase Down the Hierarchy of the Sciences, Daniele Fanelli, 2010). Fanelli looks at 2000+ papers in 20+ fields, and finds that the percentage of papers reporting positive results increases for the "softer" sciences; psychology has the highest percentage, at about 90%. I then took a random sample of 400 papers in my subfield (using ML to solve PDEs). Of the 232 papers in that sample whose abstracts mention positive and/or negative experimental results, 95% (220/232) mention only positive results, 5% (12/232) mention both positive and negative results, and 0% (0/232) mention only negative results.
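The percentages above can be reproduced from the raw tallies with a quick sketch (the counts are taken from my sample; the category labels are just illustrative names, not from Fanelli's coding scheme):

```python
# Tallies from the 400-paper sample: 232 abstracts mention
# experimental results at all; the rest are excluded from the denominator.
counts = {"positive only": 220, "mixed": 12, "negative only": 0}
total = sum(counts.values())  # 232

for label, n in counts.items():
    # Percentages are reported rounded to the nearest whole percent.
    print(f"{label}: {n}/{total} = {100 * n / total:.1f}%")
```

Note that 220/232 is 94.8%, which rounds to the 95% quoted above, and 12/232 is 5.2%, rounding to 5%.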
The percentage of positive results in (my subfield of) ML appears to be much higher than in any other field of science! This of course doesn't prove anything (not the same experimental design, not a perfect proxy, etc), but it does suggest that ML has a really serious issue with researchers interpreting and presenting their results in biased ways so that they can market a paper with a "positive" result.
I wrote about incentives in a paper (under review):
"We emphasize both good intentions and perverse incentives as explanations for the apparent bias towards positive results. The culture of scientific ML is one in which well-intentioned researchers try to figure out ways that ML might be useful for science. In the process of doing so, they tend to be less interested in reporting ways that ML isn’t useful. Perverse incentives also contribute. Because ML research rewards novel ideas and positive experimental results, all else being equal articles with weak baselines and/or reporting biases are more likely to get accepted to prestigious venues and more likely to be widely cited. Incentives against negative results are particularly strong in scientific ML, because career advancement (in academia) and lucrative jobs (in industry) depend on the presumption that ML will be a useful tool for scientific problems. Negative results could cast doubt on that presumption, thereby undermining justification for one’s research area."