Posts by vb
Qwen released QvQ 72B OpenAI o1 like reasoning model on Hugging Face with Vision capabilities - beating GPT4o, Claude Sonnet 3.5 🔥
3.1 70B vs 3.3 70B:
Code Generation
> HumanEval: 80.5% → 88.4% (+7.9%)
> MBPP EvalPlus: 86.0% → 87.6% (+1.6%)
Steerability
> IFEval: 87.5% → 92.1% (+4.6%)
Reasoning & Math
> GPQA Diamond (CoT): 48.0% → 50.5% (+2.5%)
> MATH (CoT): 68.0% → 77.0% (+9%)
Llama 3.3 70B vs 405B:
> GPQA Diamond (CoT): 50.5% vs 49.0%
> Math (CoT): 77.0% vs 73.8%
> Steerability (IFEval): 92.1% vs 88.6%
huggingface.co/meta-llama/L...
BOOOOM! Meta released Llama 3.3 70B - 128K context, multilingual, enhanced tool calling, outperforms Llama 3.1 70B and comparable to Llama 405B 🔥
Comparable performance to 405B with 6x LESSER parameters ⚡
Introducing Indic-Parler TTS - Trained on 10K hours of data, 938M params, supports 20 Indic languages, emotional synthesis, apache 2.0 licensed! 🔥
w/ fully customisable speech and voice personas!
Try it out directly below or use the model weights as you want!
🇮🇳/acc
you can just do things - ask AI to create your SQL queries and execute them right in your browser! 🔥
let your creativity guide you - powered by qwen 2.5 coder 32b ⚡
available on all 254,746 public datasets on the hub!
go check it out today! 🤗
This demo of structured data extraction running on an LLM that executes entirely in the browser (Chrome only for the moment since it uses WebGPU) is amazing
My notes here: simonwillison.net/2024/Nov/29/...
To showcase how much you can do with just a 1.7B LLM, you pass free text, define a schema of parsing the text into a GitHub issue (title, description, categories, tags, etc) - Let MLC & XGrammar do the rest!
That's it, the code is super readable, try it out today! 🤗
huggingface.co/spaces/reach...
Fuck it! Structured Generation w/ SmolLM2 running in browser & WebGPU 🔥
Powered by MLC Web-LLM & XGrammar ⚡
Define a JSON schema, Input free text, get structured data right in your browser - profit!!
FYI, here's the entire code to create a dataset of every single bsky message in real time:
```
from atproto import *
def f(m): print(m.header, parse_subscribe_repos_message())
FirehoseSubscribeReposClient().start(f)
```
I have converted a portion of my NLP Online Masters course to blog form. This is the progression I present that takes one from recurrent neural network to seq2seq with attention to transformer. mark-riedl.medium.com/transformers...
I'm disheartened by how toxic and violent some responses were here.
There was a mistake, a quick follow up to mitigate and an apology. I worked with Daniel for years and is one of the persons most preoccupied with ethical implications of AI. Some replies are Reddit-toxic level. We need empathy.
> uses 90% sliding window and 10% global attention for efficiency
> 2-stage pre-training and 3-phase post-training, including a trapezoid learning rate schedule
try it out on hugging face today! 🤗
huggingface.co/collections/...
yo! nvidia finally released the weights for Hymba-1.5B - outperforms Qwen, and SmolLM2 w/ 6-12x less training
trained ONLY on 1.5T tokens
> massive reductions in KV cache size and improved throughput
> combines Mamba and Attention in a hybrid parallel architecture with a 5:1 ratio and meta-tokens
Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and tokens throughputs.
SmolVLM can be fine-tuned on a Google collab and be run on a laptop! Or process millions of documents with a consumer GPU!
Model weights on the hub, you can even run this on a Raspberry Pi! Go run, inference now! 🐐
huggingface.co/OuteAI/OuteT...
Smol TTS keeps getting better! Introducing OuteTTS v0.2 - 500M parameters, multilingual with voice cloning! 🔥
> Multilingual - English, Chinese, Korean & Japanese
> Cross platform inference w/ llama.cpp
> Trained on 5 Billion audio tokens
> Qwen 2.5 0.5B LLM backbone
> Trained via HF GPU grants
💯
🐐
It depends on what you define long context; I'm fairly confident up until 64K and moderately till 128K, beyond that - I've personally never tested.
Most of my observations are based on chat use-cases.
Yeah! @loubnabnl.hf.co & @eliebak.bsky.social are 🐐