Posts by David Hall

I think a lot of federal money is tied to accreditation like Pell grants and research funds and stuff. So while Harvard has lots of money in the endowment, it would still be a pretty big hit to the budget.

9 months ago 6 0 1 0

Many thanks to the Google TPU Research Cloud program for providing the much needed compute for this project, and to all the other great open efforts: @ai2.bsky.social @eleutherai.bsky.social and more!

11 months ago 2 0 0 0
Introducing Marin: An Open Lab for Building Foundation Models
Open-source software is a success story: It powers the world’s digital infrastructure. It allows anyone in the world to contribute based on merit. It leads to greater innovation, collaboration, and se...

You can read more in our:

- Website: marin.community
- GitHub: github.com/marin-commun...
- Discord: discord.gg/J9CTk7pqcM
- Documentation: marin.readthedocs.io
- Announcement: marin.community/blog/2025/05/1

11 months ago 1 0 1 0
Explanation of data shop: prompt or sample data comes in, llm finds more data, train a cheap model to find even more, train, --> llm

Have a specific use case? Come to our Datashop to curate data and train models.
Here’s how we curated more math data:
github.com/marin-commun...
Check out the data:
marin.community/data-browser/

11 months ago 1 0 1 0
pareto frontier of flops vs bits-per-byte

Have a new algorithm for training? Choose your compute budget and get on the speedrun leaderboard: how fast can you drive down validation loss?
marin.community/speedrun/
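
For anyone unfamiliar with the plot, here is a minimal sketch of how a Pareto frontier of compute vs. bits-per-byte could be computed. This is not Marin's leaderboard code: the `Run` type, the `pareto_frontier` helper, and all the numbers are hypothetical and only illustrate the idea that a run is on the frontier when no run with equal or less compute reaches a lower loss.

```python
from typing import NamedTuple


class Run(NamedTuple):
    name: str
    flops: float  # total training compute (hypothetical values below)
    bpb: float    # validation bits-per-byte, lower is better


def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Keep each run that beats every run using equal or less compute."""
    frontier = []
    best_bpb = float("inf")
    # Sort by compute, breaking ties by loss, so cheaper and better runs come first.
    for run in sorted(runs, key=lambda r: (r.flops, r.bpb)):
        if run.bpb < best_bpb:
            frontier.append(run)
            best_bpb = run.bpb
    return frontier


# Made-up runs, purely for illustration.
runs = [
    Run("a", 1e18, 1.20),
    Run("b", 2e18, 1.25),  # more compute than "a" but worse loss
    Run("c", 4e18, 1.05),
    Run("d", 4e18, 1.10),  # same compute as "c" but worse loss
    Run("e", 8e18, 0.98),
]

frontier = pareto_frontier(runs)
print([r.name for r in frontier])  # ['a', 'c', 'e']
```

A new training algorithm improves the frontier if its run lands below the existing curve at its compute budget.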

11 months ago 0 0 1 0
Flowchart showing GitHub issue (preregistration) -> pull request (experiment.py) -> execution (watch it live) -> WandB report (analysis)

Marin (marin.community) repurposes GitHub, which has been successful for open-source *software*, for AI:
1. Preregister an experiment as a GitHub issue
2. Submit a PR, which implements the experiment in code
3. PR is reviewed by experts in the community
4. Watch the execution of the experiment live!

11 months ago 0 0 1 0
open weights vs open source (weights + code + recipe) vs open development (+ process, anyone can contribute)

Marin is a new "open lab" for developing foundation models. More than open weights, and even open source, with Marin we're committing to "open development": everything is documented and traceable, and anyone can contribute.

11 months ago 1 0 1 0
Learn more about the project in Percy's blog post: marin.community/blog/2025/05...

And about the models we are releasing in @dlwh.bsky.social's training retro: marin.readthedocs.io/en/latest/re...

11 months ago 0 1 0 0
Percy Liang on X: "What would truly open-source AI look like? Not just open weights, open code/data, but *open development*, where the entire research and development process is public *and* anyone can contribute. We built Marin, an open lab, to fulfill this vision: https://t.co/racsvmhyA3"

Super excited Marin is finally out! Come see what we've been building! Code/platform for training fully reproducible models end-to-end, from data to evals. Plus a new high-quality 8B base model. Percy did a good job explaining it on the other place. marin.community

x.com/percyliang/s...

11 months ago 20 6 1 0