🔗Recipe: huggingface.co/learn/cookbo...
🔗Original blog by Edward Beeching, @lewtun.bsky.social and @srushnlp.bsky.social from @hf.co: huggingface.co/spaces/Huggi...
Thanks @stevhliu.hf.co and @lewtun.bsky.social for the feedback 🙏
Posts by Sergio Paniego
[Diagram: Scaling test-time compute with open models]
🧠 Following Hugging Face's blog on scaling test-time compute with open models—letting models "think longer," inspired by OpenAI & DeepMind—I created a recipe to extend inference time for Instruct LLMs, tackling harder tasks like complex math problems.
Links below 👇
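Conceptually, the simplest way to spend more test-time compute is best-of-N sampling: draw several candidate answers and keep the one a verifier scores highest. A minimal sketch with toy stand-ins for the model and the scorer (not the recipe's actual code):

```python
import random

def generate(prompt, seed):
    """Stand-in for an Instruct LLM sample; a real setup would sample the
    model with temperature > 0 to get diverse candidates."""
    random.seed(seed)
    return f"{prompt} -> answer {random.randint(0, 9)}"

def score(completion):
    """Stand-in for a reward model / verifier scoring each candidate."""
    return int(completion.split()[-1])  # toy rule: higher digit = better

def best_of_n(prompt, n=8):
    # Spend more inference compute: sample N candidates, keep the best-scored one.
    candidates = [generate(prompt, seed=i) for i in range(n)]
    return max(candidates, key=score)
```

More elaborate strategies from the blog (beam search, lookahead) follow the same pattern: generate more, verify, keep the best.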
Here’s what’s included:
📷 SmolVLM (VLM) by @hf.co
🔧 SFT & DPO fine-tuning methods
⚙️ Runs on consumer GPUs
🔗SFT project: huggingface.co/learn/cookbo...
🔗DPO project: huggingface.co/learn/cookbo...
🙏 @stevhliu.hf.co & @merve.bsky.social & @benburtenshaw.bsky.social
I’m a big fan of smol models—compact, efficient, and perfect for inference/training on limited resources. Even better when they’re multimodal! 🤏✨
I explored fine-tuning SmolVLM, a multimodal smol model, using TRL with SFT and DPO, and turned it into 2 hands-on projects!
🔗Links below👇
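For context, DPO optimizes a preference loss over (chosen, rejected) answer pairs. A minimal sketch of that objective for a single pair, illustrative only and not TRL's implementation:

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one (chosen, rejected) pair.
    Each argument is a summed token log-probability of that answer under the
    policy model (logp_*) or the frozen reference model (ref_logp_*)."""
    # Implicit rewards: how much the policy prefers each answer vs. the reference.
    chosen_reward = beta * (logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (logp_rejected - ref_logp_rejected)
    # -log(sigmoid(margin)): push the chosen reward above the rejected one.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

In practice TRL's `DPOTrainer` handles all of this batched over a preference dataset; the sketch just shows what is being minimized.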
💡I've been exploring how to go smol with multimodal RAG.
I've created a project that uses SmolVLM and ColSmolVLM to build a multimodal RAG pipeline that runs on Colab's free tier.
Featuring:
🤏👀 SmolVLM (VLM)
🤏📚 ColSmolVLM (Doc Retrieval)
⚙️ Runs in Colab's free-tier GPU
Link below
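For the curious: ColSmolVLM is a ColPali-style late-interaction retriever, scoring each document page by matching every query-token embedding against the page's patch embeddings. A toy sketch of that MaxSim scoring (real embeddings come from the model; these vectors are made up):

```python
def maxsim_score(query_vecs, page_vecs):
    """ColPali-style late-interaction score: for each query-token embedding,
    take its best match among the page's patch embeddings, then sum."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)

def retrieve(query_vecs, pages, top_k=1):
    """Rank pages (each a list of patch embeddings) by MaxSim, keep top_k."""
    ranked = sorted(pages.items(),
                    key=lambda kv: maxsim_score(query_vecs, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]
```

The retrieved page images then go straight to the VLM, so no OCR step is needed.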
💡 New Multimodal RAG Recipe with Re-Ranking 💡
I explored how to enhance a multimodal RAG pipeline by integrating a re-ranker!
Featuring:
✨ Qwen2-VL-7B (VLM)
📚 ColQwen2 (Doc Retrieval)
🔍 MonoQwen2-VL (Re-ranking)
🔥 Optimized for consumer GPUs with quantized VLMs.
Link below:
Learn how to build a complete multimodal RAG pipeline with
ColQwen2 as retriever, MonoQwen2-VL as reranker, Qwen2-VL as VLM in this notebook that runs on a GPU as small as L4 🔥 huggingface.co/learn/cookbo...
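The pipeline shape is retrieve → re-rank → generate. A schematic sketch with stub functions standing in for ColQwen2, MonoQwen2-VL, and Qwen2-VL:

```python
def rerank(query, candidates, relevance_fn):
    """Second stage: re-score the retriever's candidates with a stronger
    (slower) pointwise relevance model and sort by that score."""
    return sorted(candidates, key=lambda doc: relevance_fn(query, doc), reverse=True)

def rag_answer(query, retriever, relevance_fn, generate_fn, top_k=3, keep=1):
    # Stage 1: cheap retrieval casts a wide net over the corpus.
    candidates = retriever(query, top_k)
    # Stage 2: the re-ranker refines the order; keep only the best pages.
    context = rerank(query, candidates, relevance_fn)[:keep]
    # Stage 3: the VLM answers grounded in the re-ranked context.
    return generate_fn(query, context)
```

The point of the extra stage: the retriever is fast but approximate, so a re-ranker spending more compute on just the top-k candidates improves the context the VLM actually sees.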
✨ Gave a talk on autonomous driving today to undergrad students! We covered everything from definitions to real-world examples, plus cutting-edge concepts like Generative World Models and Vision-Language Models (VLMs). Exciting future ahead! 🚗💡
This is such a cool project, and it was a truly exciting experience to contribute to it!! 😀
We took those TRL notebooks from last week and made a page from them. So if you're upskilling on fine-tuning or aligning LLMs, and want examples from the community (like Maxime Labonne, Philipp Schmid, and Sergio Paniego Blanco), check it out!
bsky.app/profile/benb...
>> huggingface.co/docs/trl/mai...
Thanks to @arig23498.bsky.social, @pcuenq.hf.co, and @reach-vb.hf.co for the collaboration. It's a pleasure working with such talented individuals! 🚀
I've been exploring the latest Llama 3.2 releases and working on a couple of projects you may find interesting:
1️⃣ Understanding tool calling with Llama 3.2 🔧
2️⃣ Using Text Generation Inference (TGI) with Llama models 🦙
(links in the next post)
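The core of tool calling is simple: the model emits a structured (typically JSON) call, and your code parses it and dispatches to a Python function. A minimal sketch; the tool name and registry here are made up for illustration:

```python
import json

# Hypothetical tool registry; with Llama 3.2, the tool schemas are passed
# through the chat template and the model emits a JSON tool call in reply.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(model_output):
    """Parse a model-emitted tool call like
    {"name": ..., "arguments": {...}} and run the matching function."""
    call = json.loads(model_output)
    return TOOLS[call["name"]](**call["arguments"])
```

The tool's result is then appended to the conversation so the model can compose its final answer from it.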
🔗 Link to the blog post: weaviate.io/blog/what-is... (by Erika Cardenas, @iamleonie.bsky.social)
🔗 Link to the recipe: huggingface.co/learn/cookbo...
🤗 Huge thanks to Aymeric Roucher and @stevhliu.hf.co for their support and insights!
In this notebook, I use Qwen2.5-72B-Instruct as the LLM to build a system with:
1️⃣ A manager agent
2️⃣ Three specialized agents: retriever, web search, and image generation
🧑🍳 The result is this new Hugging Face Cookbook recipe, where I demonstrate how to create a Multi-Agent RAG system leveraging the agent support from the transformers module.
💡 A few days ago, I came across a fascinating post about Agentic RAG by Erika Cardenas and Leonie Monigatti, and it inspired me to dive into the concept and bring it to life in code!
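Stripped to its skeleton, the manager/worker layout above is just routing: the manager decides which specialized agent gets the task and delegates. A toy sketch where `route_fn` stands in for the manager LLM's decision (the real agent framework's API differs):

```python
def manager(task, agents, route_fn):
    """Toy manager agent: pick a specialized agent for the task, delegate.
    In a real agentic framework the manager LLM plans and calls sub-agents
    as tools; here route_fn is a stand-in for that decision."""
    name = route_fn(task)
    return agents[name](task)
```

With three stub workers this routes exactly like the recipe's retriever / web search / image generation split, minus the LLM doing the planning.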
4/6 More vision skills for complex visual tasks. This tutorial shows how to fine-tune the Qwen2-VL-7B model for visual question answering using the ChartQA dataset.
huggingface.co/learn/cookbo...
by @sergiopaniego.bsky.social
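For VQA fine-tuning, each (image, question, answer) example is usually shaped into chat messages before tokenization. A sketch of that formatting step; the exact keys follow the common Transformers chat-template convention and may differ in the notebook:

```python
def format_vqa_sample(image, question, answer):
    """Shape one (image, question, answer) example as chat messages,
    the conversational format VLM fine-tuning typically expects."""
    return [
        {"role": "user", "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": question},
        ]},
        {"role": "assistant", "content": [
            {"type": "text", "text": answer},
        ]},
    ]
```

The processor's chat template then turns these messages into the token and pixel inputs the trainer consumes.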
TRL is a cornerstone of LLM post-training, and imo it's the default library to learn.
There are great alternatives like Unsloth, Axolotl, and AutoTrain. But if you want a daily driver that takes you from experimentation to production, it's TRL.
🧵 these community notebooks guide you through TRL's core:
hola 👋 hi 👋