Posts by Jeff Rasley

Preview
Post-train DeepSeek V3/R1 with DPO using just a few GPU nodes? · snowflakedb ArcticTraining · Discussion #58 Hello AI Community! We are pondering over the features we can bring to ArcticTraining in the near future that would offer value to the AI community. One such feature we are considering is the abili...

Do you want the ability to post-train DeepSeek V3/R1 models with DPO using just a few GPU nodes?

Please vote here and share your feelings: github.com/snowflakedb/...

This would be built into ArcticTraining, an open-source, easy-to-use post-training framework built on top of DeepSpeed.

1 year ago
Preview
SwiftKV Cuts LLM Inference Costs by 75% with Snowflake Cortex AI SwiftKV optimizes Meta Llama LLMs on Snowflake Cortex AI, reducing inference costs by up to 75% while maintaining accuracy for enterprise AI solutions.

We release three initial Trainers for ArcticTraining:
1. Supervised Fine-Tuning (SFT)
2. SwiftKV trainer to coincide with our model releases: snowflake.com/en/blog/up-t...
3. Speculative decoder trainer

1 year ago
Preview
ArcticTraining: Simplifying and Accelerating Post-Training for LLMs ArcticTraining, a streamlined framework for LLM post-training, offering flexible trainers, simplified structures, and native data generation pipeline.

🚀 Super proud to share ArcticTraining, an open-source post-training framework to simplify and power new research directions!
✅ Modular trainers for fast prototyping
✅ Simple callback system for easy customization
✅ Native data generation pipelines
www.snowflake.com/en/engineeri...

1 year ago