Jeff Rasley (@jeffra) Bsky

Post-train DeepSeek V3/R1 with DPO using just a few GPU nodes? · snowflakedb ArcticTraining · Discussion #58 Hello AI Community! We are pondering over the features we can bring to ArcticTraining in the near future that would offer value to the AI community. One such feature we are considering is the abili...

Do you want the ability to post-train DeepSeek V3/R1 models with DPO using just a few GPU nodes?

Please vote here and share your feelings: github.com/snowflakedb/...

This would be built into ArcticTraining, an open-source, easy-to-use post-training framework built on top of DeepSpeed.

1 year ago

SwiftKV Cuts LLM Inference Costs by 75% with Snowflake Cortex AI SwiftKV optimizes Meta Llama LLMs on Snowflake Cortex AI, reducing inference costs by up to 75% while maintaining accuracy for enterprise AI solutions.

We release three initial Trainers for ArcticTraining:
1. Supervised Fine-Tuning (SFT)
2. SwiftKV trainer to coincide with our model releases: snowflake.com/en/blog/up-t...
3. Speculative decoder trainer

1 year ago

ArcticTraining: Simplifying and Accelerating Post-Training for LLMs. ArcticTraining is a streamlined framework for LLM post-training, offering flexible trainers, simplified structures, and a native data generation pipeline.

🚀 Super proud to share ArcticTraining, an open-source post-training framework to simplify and power new research directions!
✅ Modular trainers for fast prototyping
✅ Simple callback system for easy customization
✅ Native data generation pipelines
www.snowflake.com/en/engineeri...

1 year ago
