2026 Lilly x Nucleate Grand Challenge: Aging Reimagined
$100K non-dilutive
Pitch at Lilly HQ
Lilly's science + venture teams
Focus: mobility, cognition, immune resilience, regenerative medicine: the frontiers of healthspan.
May 15
linktr.ee/lillygrand...
Posts by Jeremie Kalfon
The full dataset is open, including an interactive atlas you can explore right now online. We also released the data pipeline so you can reproduce or extend it.
I just wrote a blog post about it: www.jkobject.com/pro...
3/3
350 million cells. 16 eukaryotic organisms, from human to mouse to tomato plant. 25 TB of unique data. 337 cell types, 296 diseases, spanning almost every major tissue.
Ontology-aligned gene names, cell types, tissues, and species, consistently across the entire corpus.
2/3
The biggest bottleneck in building cell foundation models isn't the architecture. It's the data.
For scPRINT-2 we assembled what is, to our knowledge, the largest pre-training corpus for any cell foundation model. www.biorxiv.org/cont...
1/3
Instead of attending to every pair of positions, you make two lightweight passes: one along rows, one along columns.
I just wrote up a small blog post about it: jkobject.com/criss-c...
Would love to hear from anyone exploring efficient attention mechanisms.
#Transformers #Attention
2/2
Self-attention changed everything in deep learning. But it comes with a tax: O(n²) complexity. For long sequences, that's not just slow, it's a wall.
There's a cleaner way to think about it, which I introduced in my recent preprint, scPRINT-2. It is called Criss-Cross Attention:
1/2
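For anyone who wants to see the row/column idea in code, here is a minimal NumPy sketch of a generic row-then-column (axial-style) attention on an H x W grid. It is an illustration under assumptions, not the scPRINT-2 implementation: all function names are hypothetical, Q, K and V are just the raw inputs (no learned projections), and there is a single head.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(q, k, v):
    # Scaled dot-product attention; leading axes are treated as a batch.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def row_column_attention(x):
    # x has shape (H, W, d).
    # Pass 1: every position attends only to positions in its own row.
    rows = attend(x, x, x)                      # (H, W, d)
    # Pass 2: every position attends only to positions in its own column.
    cols_in = np.swapaxes(rows, 0, 1)           # (W, H, d)
    cols = attend(cols_in, cols_in, cols_in)    # (W, H, d)
    return np.swapaxes(cols, 0, 1)              # back to (H, W, d)


def full_attention(x):
    # Baseline: flatten the grid and attend over all n = H * W positions.
    h, w, d = x.shape
    flat = x.reshape(h * w, d)
    return attend(flat, flat, flat).reshape(h, w, d)


def score_counts(h, w):
    # Number of pairwise attention scores each variant computes.
    n = h * w
    return {"full": n * n, "row_column": h * w * w + w * h * h}
```

For a 32 x 32 grid (n = 1024), `score_counts(32, 32)` counts 1,048,576 pairwise scores for full attention versus 65,536 for the two passes. The two variants are not numerically equivalent; the sketch only shows where the savings come from.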
We would rather let some people get cancer, MS, or Parkinson's than give a virus to people who will get it anyway. Many people might volunteer! Indeed, we give so much to charities fighting cancer, MS, and dementia, but when it comes time to do something about it, it seems no one wants to.
6/6
Nowadays, no regulatory agency, least of all in Europe, would let you do that.
5/6
You could recruit kids, give them the vaccine, and infect them with EBV directly, since you know that almost all of them will be infected at some point anyway. Then you check whether they got infected using sequencing (PCR tests, B-cell antigen sequencing) and accept this as the endpoint of the trial.
4/6
Because you need to recruit tens of thousands of young kids, test them often for infection, and wait decades to see symptoms of other diseases appear in some of them. The statistics are terrible.
But it could be cheap...
3/6
From different cancers, skin diseases, dementia, Parkinson's, and more.
The reason there is still no vaccine in 2026 is very interesting.
Basically, testing one is super expensive. But why?
2/6
Did you know that most cases of multiple sclerosis (MS) are likely driven by the Epstein-Barr virus (EBV), the herpesvirus that causes mononucleosis?
>90% of us get infected in our teens, and some will go on to develop many diseases later in life because of it.
1/6
And then lucky to pursue this through an atlas of cells of many types and species, with a focus on quality and diversity mattering more than quantity, with @jkobject.com
@cantinilab.bsky.social
Paper: www.biorxiv.org/cont... • Code: github.com/cantinila...
Curious: **what's the one benchmark you wish every single-cell foundation model reported by default?**
6/6
4. **Generalization:** evaluation on **unseen organisms, tasks, and modalities.** It is also a push to rethink how scFMs are evaluated; **SOTA on many tasks**.
If you're reading papers over the break, I hope this is useful.
5/6
3. **Data + pipeline:** unified **scBaseCount + Tahoe-100M + CELLxGENE**, with consistent preprocessing + weighted random sampling (and other practical bits that usually stay hidden), giving **350M cells, 16 species, ~300 tissues, ~500 cell types**.
4/6
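Since weighted random sampling is easy to describe and easy to get wrong, here is a minimal sketch of one common way to do it: weight each cell by the inverse of its group's frequency, raised to a power. This is an assumption for illustration, not the scPRINT-2 pipeline; the function names, the `power` knob, and grouping by a single label are all hypothetical.

```python
import numpy as np


def sampling_weights(labels, power=0.5):
    # Per-cell sampling probabilities that down-weight over-represented groups.
    # power=0 gives uniform sampling; power=1 gives every group equal total mass.
    labels = np.asarray(labels)
    _, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    weights = 1.0 / counts[inverse] ** power
    return weights / weights.sum()


def draw_batch(labels, batch_size, rng, power=0.5):
    # Draw cell indices (with replacement) according to the weights above.
    p = sampling_weights(labels, power)
    return rng.choice(len(labels), size=batch_size, replace=True, p=p)
```

With 90 cells of one type and 10 of another, `power=1.0` gives each type half of the total sampling probability, so the rare type is no longer swamped. Intermediate values of `power` trade off between that and the raw proportions.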
1. **Benchmark:** **42 components** of scFMs across a gymnasium of tasks; looking at dataset size, encoding, training, architectures, losses, etc.
2. **Model:** **scPRINT-2**, *small but mighty* with **~20M active parameters**, built from the strongest ingredients we found.
3/6
After a few years building scFMs (scPRINT, Xpressor, scPRINT-2...), I wanted to do something more "complete" than just shipping a new model: understand what matters, train the best version we can, and stress-test generalization properly.
So this work is a 4-in-1 release:
2/6
Christmas Foundation Model Release: scPRINT-2
**One-liner:** a **20M-active-param** single-cell foundation model trained on **350M cells / 16 species / 300 tissues / 500 cell types**.
1/6
Thanks to Future4Care, Timothé Cynober, Whitelab Genomics, and Scienta Lab for organizing the event, and to Matteo Marengo, Clara Brouaux, and Gabriel Michaux for helping me manage the round table.
And thanks to my all-star panel: Yann Fleureau, Jérémy Besnard, Sofia Dahoune, and Steven Jerome.
It was a blast hosting our Nucleate Inside AI roundtable at the France Techbio 2025 event.
No double-talk, and amazing panelists:
- Yann Fleureau, CEO, Blossom Life Sci & Founder of Cardiologs
- Steven Jerome, Director, Lead of Hit Discovery, Schrödinger
- Jérémy Besnard, Advisor, InFocusTx & Co-founder of Exscientia
- Sofia Dahoune, Partner at Daphni
2/3
I am excited to present a round table I am hosting together with Matteo Marengo and Gabriel Michaux, as part of our emerging Nucleate Parisian chapter led by Clara Brouaux.
Title: **Inside AI: Choosing the Right Path to Value Creation**
1/3
Next week we will see the first conference where both the main authors and reviewers are LLM Agents!
This might be fun to follow: agents4science.stanf...
I am presenting my PhD work today at the immuno-oncology conference at the CRCT Oncopole in Toulouse!
Happy to talk about how we can use foundation models in the real world.
- Alnylam BioVenture Challenge: one day at Alnylam HQ, one shot at $100K in non-dilutive funding. Apply by Oct 17.
And we're also recruiting the next generation of Nucleate Leaders. If you're ready to build biotech and strengthen the community behind it, apply today.
It's about growth, collaboration, and the chance to give back by lifting others.
Two flagship opportunities are now open:
- Activator 2026: our equity-free accelerator equipping scientific founders with the tools to launch biotech ventures. Apply by Oct 20.