Congrats, Kyle! Well deserved.
Posts by Quentin Anthony
Available at github.com/Quentin-Anth...
Contributions are welcome! I'll be slowly tackling roadmap items myself in my off time.
I basically want functional pseudocode that students and self-learners can quickly run and play around with. How does latency increase with message size? How do collective algorithms differ? What’s the effect of warmup? Find out for yourself!
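To illustrate the kind of experiment meant here, below is a minimal latency-measurement sketch. It is not real MPI (and not nanoMPI's API): two threads echoing a payload through a pair of queues stand in for two ranks, purely to show the methodology — warmup iterations, many timed round trips, median of the samples.

```python
import queue
import threading
import time


def ping_pong_latency(msg_bytes, iters=200, warmup=20):
    """Estimate one-way latency for a message bouncing between two "ranks".

    Two threads + queues simulate two MPI ranks; the peer echoes a *copy*
    of the payload so larger messages genuinely cost more to move.
    """
    to_peer, from_peer = queue.Queue(), queue.Queue()

    def peer():
        # Echo each message back until the stop sentinel (None) arrives.
        while True:
            m = to_peer.get()
            if m is None:
                break
            from_peer.put(bytes(m))  # copy forces a bandwidth-dependent cost

    t = threading.Thread(target=peer)
    t.start()

    msg = bytes(msg_bytes)
    samples = []
    for i in range(warmup + iters):
        t0 = time.perf_counter()
        to_peer.put(msg)
        from_peer.get()
        dt = time.perf_counter() - t0
        # Discard warmup iterations: the first round trips pay one-time
        # costs (thread wakeup, cold caches) that skew the measurement.
        if i >= warmup:
            samples.append(dt)

    to_peer.put(None)
    t.join()
    samples.sort()
    return samples[len(samples) // 2] / 2  # half the median round trip


if __name__ == "__main__":
    for size in (1, 1024, 1 << 20):
        print(f"{size:>8} B: {ping_pong_latency(size) * 1e6:.1f} us")
```

Even in this toy setup you can see the classic latency/bandwidth split: tiny messages are dominated by per-message overhead, while the 1 MiB payload's time grows with the copy cost.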
nanoMPI’s design is based on OpenMPI. OpenMPI’s shortcoming is that it’s built for a different purpose (modularity and performance), so it’s harder to get quick answers and results from it compared to nanoMPI’s purpose (clarity and easy installation).
I consider nanoMPI to be a companion piece to conceptual MPI material (e.g. you read a description and see a visual of a ring allreduce, but what does this actually look like in code?)
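In that spirit, here is a sketch of what a ring allreduce looks like in code. This is a plain-Python simulation, not nanoMPI's actual implementation: all P ranks live in one process as lists, and each "send" is a snapshot copy so every step behaves as if the exchanges happened simultaneously.

```python
def ring_allreduce(buffers):
    """Simulate a ring allreduce (sum) across P ranks.

    `buffers` is a list of P equal-length lists, one per rank. Returns the
    per-rank results; every rank ends up with the elementwise sum.
    """
    P = len(buffers)
    n = len(buffers[0])
    assert n % P == 0, "for simplicity, length must split into P equal chunks"
    chunk = n // P
    bufs = [list(b) for b in buffers]  # copy so callers' buffers survive

    def idx(c):
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1: reduce-scatter. In step s, rank r sends chunk (r - s) % P to
    # rank (r + 1) % P, which adds it into its own copy. After P-1 steps,
    # rank r holds the fully reduced chunk (r + 1) % P.
    for step in range(P - 1):
        outgoing = []  # snapshot all sends first: exchanges are simultaneous
        for r in range(P):
            c = (r - step) % P
            outgoing.append((c, [bufs[r][i] for i in idx(c)]))
        for r in range(P):
            c, data = outgoing[(r - 1) % P]  # receive from left neighbor
            for k, i in enumerate(idx(c)):
                bufs[r][i] += data[k]

    # Phase 2: allgather. In step s, rank r forwards its freshest reduced
    # chunk, (r + 1 - s) % P, and the receiver overwrites (no reduction).
    for step in range(P - 1):
        outgoing = []
        for r in range(P):
            c = (r + 1 - step) % P
            outgoing.append((c, [bufs[r][i] for i in idx(c)]))
        for r in range(P):
            c, data = outgoing[(r - 1) % P]
            for k, i in enumerate(idx(c)):
                bufs[r][i] = data[k]

    return bufs
```

Each rank sends 2(P-1) chunks of size n/P, which is why the ring algorithm's bandwidth cost stays nearly constant as you add ranks — exactly the kind of property you can poke at interactively once the algorithm is a few dozen lines of plain code.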
nanoMPI serves the dual purpose of:
Providing a minimal implementation for HPC education
Testing distributed code on offline, local machines (I just wanna code on my laptop on a plane, not a remote HPC system)
Inspired by “minimal implementation” projects in AI such as
@karpathy.bsky.social’s nanoGPT, I worked to bring this concept to the HPC world!
I’ve built a minimal implementation of an MPI library called nanoMPI, which focuses on clarity, simplicity, and easy installation.
We are the first to demonstrate higher training kernel throughput (both transformers and SSM hybrids) on AMD MI300X compared to H100!
- rocm.blogs.amd.com/ecosystems-a...
- www.zyphra.com/post/trainin...
C R A C K E D
We dropped the Zamba2 and Zyda-2 tech reports on arxiv!
- Zamba2 models of size 1.2B, 2.7B, 7.4B
- Zyda-2 5T token dataset
- We discuss more specifics on model arch, training process, dataset creation, etc
Links:
- Zamba2: arxiv.org/abs/2411.15242
- Zyda-2: arxiv.org/abs/2411.06068
I keep coming back to interstellar: youtu.be/YF1eYbfbH5k?...