Do you want the ability to post-train DeepSeek V3/R1 models with DPO using just a few GPU nodes?
Please vote here and share your feelings: github.com/snowflakedb/...
This would be built into ArcticTraining, an open-source, easy-to-use post-training framework built on top of DeepSpeed.
Posts by Jeff Rasley
1 year ago
We are releasing three initial trainers for ArcticTraining:
1. Supervised Fine-Tuning (SFT)
2. SwiftKV trainer to coincide with our model releases: snowflake.com/en/blog/up-t...
3. Speculative decoder trainer
1 year ago
Super proud to share ArcticTraining, an open-source post-training framework to simplify and power new research directions!
Modular trainers for fast prototyping
Simple callback system for easy customization
Native data generation pipelines
www.snowflake.com/en/engineeri...
1 year ago