Wing Lian (@winglian) Bsky

GitHub - axolotl-ai-cloud/grpo_code

GitHub: github.com/axolotl-ai-c...
Model 🤗🤖: huggingface.co/axolotl-ai-c...

1 year ago

Training Large Language Models with Interpreter Feedback using WebAssembly (a blog post by Axolotl AI on Hugging Face)

The sandbox uses WebAssembly + Python multiprocessing to safely execute model-generated code in parallel, fully locally. This enables scalable, automated reward signals for GRPO fine-tuning without the complexity of Docker or remote eval infra.

Blog Post 🤗: hf.co/blog/axolotl...
🧵(2/3)

1 year ago
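
For readers who want a feel for the "execute model-generated code in parallel, get a reward back" idea from the post above, here is a minimal sketch. It is not the grpo_code implementation: it swaps the WebAssembly sandbox for plain child Python processes and swaps multiprocessing for a thread pool that launches one subprocess per completion, so it offers no real isolation and should only be run on trusted snippets. The names `run_candidate` and `score_batch`, the 5-second timeout, and the binary 0/1 reward are all assumptions made for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_candidate(code: str, timeout_s: float = 5.0) -> float:
    """Run one snippet in a child interpreter; 1.0 on clean exit, else 0.0.

    NOTE: a plain subprocess is NOT a sandbox. The post describes a
    WebAssembly-based interpreter; this stand-in only shows the reward shape.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Runaway or hanging code earns no reward.
        return 0.0
    return 1.0 if result.returncode == 0 else 0.0


def score_batch(completions: list[str], max_workers: int = 4) -> list[float]:
    """Score a batch of completions concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_candidate, completions))
```

A call such as `score_batch([snippet_a, snippet_b])` returns one float per completion in input order, which is the shape a GRPO-style reward function typically needs. A failed assertion, an uncaught exception, or a timeout all map to zero reward here.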

We've implemented a simple toolkit for fine-tuning powerful coding models using only RL with an entirely local, zero-setup sandboxed code interpreter. We found very promising results using a tiny fraction of data & training time vs SFT. Check out our blogpost for more details! 👇
🧵(1/3)

1 year ago

Some of my fav engineers / researchers in AI/ML.

If you’re a cracked eng / researcher I missed please comment and I’ll add you! 🦋

go.bsky.app/H9nj9nJ

1 year ago

Needs backwards kernels so we can use it for finetuning 😅

1 year ago
