This paper is a first look at how AlphaFold is impacting science, but this story will continue as AI tools develop and the research community adapts to new opportunities. There's a lot to learn here, so we would love to talk to anyone who is working on similar topics!
Posts by Ryan Hill
These results suggest that AI is facilitating new research, but progress is slow. A bottlenecks model of R&D production seems to be at play, where AI unlocks tasks, lowering the cost of new discovery, but downstream progress is slowed by tasks that remain difficult to automate.
Lastly, we study impacts on early-stage drug discovery by looking at bioactivity experiments, a key first step in assessing whether candidate drugs bind to target proteins. We see no evidence yet of a shift in attention toward previously undetermined proteins.
We find a noticeable shift in attention toward previously unsolved structures that now have new structural insights available through AlphaFold. These papers are about a variety of topics, including protein function, process, and diseases.
Next we turn our attention to the broader field of protein science and consider how scientists in related areas are using AlphaFold insights. Here we compare publishing activity about proteins that had previously been experimentally solved vs. those that had not.
Instead of AI substituting for experimental output, we see evidence of complementarities between experimental and AI methods. Experimentalists are now using AlphaFold structures as templates to help solve new structures experimentally.
We see no evidence of this so far. Here we plot the number of new experimentally determined structures in the PDB in the three years since AlphaFold. Experimentalists continue to publish these structures and do so in good journals like Science, Nature, and Cell.
The first part of our empirical analysis focuses on AlphaFold as a task-automating technology. We ask: Has the introduction of AlphaFold started to displace experimentalists, whose (rather expensive) methods might be substitutable with AI output?
Between 2021 and 2022, DeepMind predicted the structure of every known protein and posted the structures publicly on an EMBL-hosted website. A huge trove of new structure insights immediately became available for public use. Exciting stuff!
In 2020, DeepMind introduced AlphaFold2 in the CASP competition and provided the first AI structure predictions with near-experimental accuracy. It was heralded as a huge and unexpected leap forward, with many predicting a rapid acceleration in protein science and drug discovery.
Proteins can be described as a 1D sequence of amino acids that folds into a 3D structure. It has long been hypothesized that computational methods could be used to predict 3D structures of proteins from just the sequence.
For decades, structural biologists have used expensive and difficult experimental methods like X-ray crystallography to solve protein structures. Even after half a century of effort, there are many proteins that have not been solved experimentally.
A quick preview of the results:
1. AlphaFold is complementing rather than substituting for experimental research
2. AlphaFold is stimulating new basic research about previously unsolved proteins
3. No evidence yet that it has had a noticeable impact on early-stage drug research
The paper is preliminary and our analysis is ongoing, but due to the excitement around the topic, we wanted to report what we've learned so far. Feedback is welcome! Paper link: carolynstein.github.io/files/papers...
There is great excitement about the potential for AI to reshape science, but so far very little empirical evidence about how that is (or is not??) happening in real time. I'm excited to share a new working paper with Carolyn Stein about the impact of AlphaFold on science ->
i'm never showing my kid tony hawk. they'll have to recreate the 900 from first principles
Let’s make it happen
*mulder’s eyes widen as scully hands him the ancient xylophone*
Banana ball brain
What a pull
Too busy hustling
You listed six different publishers that are operating at scale. That’s more competitive than many mature industries, and it doesn’t include the many non-profit and university publishers
The best way to think about academic publishing is that the creators are also the consumers. No one is “buying” our papers at the newsstand. Elsevier is basically a management company that gets a fee for running the peer-review process. If they charge too much, a competitor will replace them.
Henry Hudson to blame presumably
East River?
Love that 5% of people don’t care for maps on their phones. Give me the Rand McNally US Road Atlas or give me death!
He can ask the cash cab questions
Competitive rates!