
Posts by Aritra Roy Gosthipaty

5/N Each of the techniques mentioned above has its own pros and cons. The processor in your system (a phone, a laptop, etc.) combines a weighted mix of all of them.

It baffles me to think about all of this. 🤗


4/N Multithreading in a Single Core

Within a single core we can keep multiple register blocks (context blocks), each tracking a different stream of instructions. This way, if one thread stalls, the core quickly switches to another.
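A rough sketch in Python: OS-level threads are not the same as hardware contexts, but the effect is analogous — while one thread is stalled (here, sleeping), the others make progress.

```python
import threading
import time

def worker(i, results):
    # Simulate a stall (e.g., waiting on memory or I/O).
    time.sleep(0.1)
    results[i] = i * 2

results = {}
threads = [threading.Thread(target=worker, args=(i, results)) for i in range(4)]

start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The four 0.1 s stalls overlap, so the total is close to 0.1 s, not 0.4 s.
```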


3/N The SIMD Paradigm

If a single core has duplicated ALUs, we can operate on a bunch of data in a single clock tick. The catch? Each operation has to be the same.

Single Instruction, Multiple Data
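A rough analogy in Python, assuming NumPy (whose vectorized ops typically compile down to SIMD instructions): one expression applies the same add across all elements, versus a scalar loop that handles one element at a time.

```python
import numpy as np

a = np.arange(8)
b = np.arange(8)

# SIMD-style: one vectorized add applied across all "lanes" at once.
vectorized = a + b

# Scalar-style: the same add, one element per step.
scalar = np.empty_like(a)
for i in range(len(a)):
    scalar[i] = a[i] + b[i]
```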


2/N Multi-Core Processors

A single processor core consists of a control unit, an arithmetic logic unit, and some registers. What if we duplicate this whole block several times? That is the multi-core architecture. As a programmer, you need to explicitly specify which code runs on which core.
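A minimal Python sketch of "specifying which code runs where", using the standard library's multiprocessing (a process-level analogy, not a description of the hardware):

```python
from multiprocessing import Pool

def square(n):
    # This function runs in worker processes, which the OS can
    # schedule onto different cores.
    return n * n

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # map() explicitly hands each piece of work to a worker.
        print(pool.map(square, range(8)))
```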


1/N Superscalar Processors

Your program is a list of instructions. This list almost always contains independent instructions. A superscalar processor identifies them and executes them separately in the same clock tick.
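CPython doesn't expose instruction-level parallelism, but the dependence structure is easy to illustrate; in this sketch each line stands in for one machine instruction:

```python
# Independent: no data flows between these two, so a superscalar core
# could issue both in the same clock tick.
a = 3 + 4   # "instruction" 1
b = 5 * 6   # "instruction" 2

# Dependent: this needs both results, so it must wait for 1 and 2.
c = a + b   # "instruction" 3
```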


Some pointers on parallel computing:

A small thread 🧵👇

SigLIP2 - a google Collection

HF model collection for transformers:
huggingface.co/collections/...

HF model collection for OpenCLIP and timm:
huggingface.co/collections/...

And of course big_vision checkpoints:
github.com/google-resea...

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features We introduce SigLIP 2, a family of new multilingual vision-language encoders that build on the success of the original SigLIP. In this second iteration, we extend the original image-text training obje...

Paper:
arxiv.org/abs/2502.14786

HF blog post from @arig23498.bsky.social et al. with a gentle intro to the training recipe and a demo:
huggingface.co/blog/siglip2

Thread with results overview from Xiaohua (only on X, sorry - these are all in the paper):
x.com/XiaohuaZhai/...


📢2⃣ Yesterday we released SigLIP 2!

TL;DR: Improved high-level semantics, localization, dense features, and multilingual capabilities, as a drop-in replacement for v1.

Bonus: Variants supporting native aspect ratio and variable sequence length.

A thread with interesting resources👇


Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker! by @arig23498.bsky.social

Build a proof-of-concept API, hosting Qwen2.5-VL-7B-Instruct on Hugging Face Spaces using Docker.

huggingface.co/blog/ariG234...
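A hypothetical sketch of what such a Docker Space might look like (the file names, app module, and dependencies are assumptions, not the blog post's actual setup):

```dockerfile
# Hypothetical Dockerfile for serving a model behind FastAPI on a Space.
FROM python:3.11-slim
WORKDIR /app

RUN pip install --no-cache-dir fastapi uvicorn transformers torch

# app.py would load Qwen2.5-VL-7B-Instruct and expose a POST endpoint.
COPY app.py .

# Hugging Face Docker Spaces expect the app on port 7860 by default.
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
```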

Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo

huggingface.co/blog/logits-...


I forgot to mention that you can use the same code to access any `warm` model on the Hub.

Here is a list of all the `warm` models: huggingface.co/models?infer...

Happy vibe checking 😇

[N/N]

qwq-inference-api.ipynb · ariG23498/quick-notebooks

I have created a simple, quick notebook that uses `huggingface_hub` to access the model through this Inference API.

huggingface.co/datasets/ari...
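The notebook's approach can be sketched roughly like this (the exact code lives in the notebook; treat this as an outline using `huggingface_hub`'s `InferenceClient`):

```python
from huggingface_hub import InferenceClient

# Assumes huggingface_hub is installed and an HF token is configured
# (e.g. via `huggingface-cli login`) for the actual network calls.
client = InferenceClient("Qwen/QwQ-32B-Preview")

def ask(prompt: str, max_tokens: int = 256) -> str:
    """Query the model through the Serverless Inference API (network call)."""
    response = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content
```

Calling `ask(...)` then hits the hosted model directly; no local GPU involved.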

[4/N]


But today it was my lucky day. I noticed that the model was already loaded on the Serverless Inference API and was ready to be used.

No more spinning up my GPUs and stress testing them (happy GPU noises)

[3/N]


My usual workflow is to visit the Hugging Face Hub model card (here that was hf.co/Qwen/QwQ-32B-Preview) and copy the working code sample.

I am sure this is how most of you work with a new model as well (if not, I would love to hear from you)

[2/N]


The Qwen team is doing so much for the community by keeping research open and constructive.

They listen to the community and put effort into building competitive models.

I was intrigued by their latest `Qwen/QwQ-32B-Preview` model and wanted to play with it.

[1/N]


I've been exploring the latest Llama 3.2 releases and working on a couple of projects you may find interesting:

1️⃣ Understanding tool calling with Llama 3.2 🔧
2️⃣ Using Text Generation Inference (TGI) with Llama models 🦙

(links in the next post)


I like the evaluation part. Are there any evals you particularly like?


What is THE pain point in training Vision Language Models, according to you?

I will go first: the data pipeline.


🙋‍♂️ ariG23498

Adding support for Qwen model by ariG23498 · Pull Request #3 · sayakpaul/simple-image-recaptioning

Re-caption your webdataset with Qwen2-VL

github.com/sayakpaul/si...

Faster Text Generation with Self-Speculative Decoding

huggingface.co/blog/layerskip


To the video generation enthusiasts: Mochi 1 Preview is now supported in `diffusers`


awesome, thanks a lot for sharing 🙌


`bitsandbytes` makes it really easy to quantize models

Note: MB should be GB in the diagram.


Read about the Qwen2.5-Coder Series

huggingface.co/blog/ariG234...


Training ranking models for better retrieval from stores is GOD level thinking.


I am diving head first into Vision Language Models. Comment below the papers that I definitely should read.

Hugging Face + PyCharm

Welcome the @huggingface.bsky.social integration in PyCharm. From instant model cards to navigating the local cache, working with Hugging Face models becomes a lot easier with PyCharm.

Bonus: Claim a 3-month PyCharm subscription using the code PyCharm4HF

Blog Post: huggingface.co/blog/pycharm...
