
Posts by Erik

Post image

We have Nvidia B200s ready to go for you in Hugging Face Inference Endpoints 🔥

I tried them out myself and the performance is amazing.

On top of that, we just got a fresh batch of H100s as well. At $4.50/hour, it's a clear winner in price/performance compared to the A100.

6 months ago 6 1 0 0
Preview
The New and Fresh analytics in Hugging Face Endpoints: We're on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co/blog/endpoin...

1 year ago 0 0 0 0
Video

We just refreshed 🍋 our analytics in @hf.co endpoints. More info below!

1 year ago 8 3 1 0
Post image

Morning workout at the @hf.co Paris office is imo one of the best perks.

1 year ago 3 0 0 0
Preview
Deploy google/gemma-3-27b-it | Inference Endpoints by Hugging Face: Deploy google/gemma-3-27b-it for image-text-to-text in 1 click.

Link to deploy:
endpoints.huggingface.co/huggingface/...

1 year ago 0 0 0 0
Video

Gemma 3 is live 🔥

You can deploy it directly from Endpoints with optimally selected hardware and configuration.

Give it a try 👇

1 year ago 6 2 1 0
Post image

Apparently, mom is a better engineer than I am.

1 year ago 4 0 0 0
https://github.com/ErikKaum/bitbubble

Today, as part of a course, I implemented a program that takes a bit stream like this:

10001001110111101000100111111011

and decodes the Intel 8088 assembly from it, like:

mov si, bx
mov bx, di

It only works for the mov instruction, register to register.

code: github.com/ErikKaum/bit...
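For anyone curious how the decoding works: a register-to-register MOV on the 8088 is two bytes, `100010dw` followed by a mod/reg/r-m byte with mod = 11. Here is a minimal Python sketch (my own illustration, not the repo's actual code) that decodes the example above:

```python
# Registers indexed by the 3-bit reg/rm fields, for w=1 (16-bit) and w=0 (8-bit).
REGS_W1 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REGS_W0 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

def decode_mov(bits):
    """Decode a stream of 16-bit register-to-register MOV instructions."""
    out = []
    for i in range(0, len(bits), 16):
        b1, b2 = bits[i:i + 8], bits[i + 8:i + 16]
        assert b1[:6] == "100010", "only MOV reg-to-reg is handled"
        d, w = int(b1[6]), int(b1[7])          # d: direction, w: word/byte
        mod = b2[:2]
        reg, rm = int(b2[2:5], 2), int(b2[5:8], 2)
        assert mod == "11", "only register mode (mod = 11)"
        names = REGS_W1 if w else REGS_W0
        # d=1: reg field is the destination; d=0: reg field is the source.
        dst, src = (names[reg], names[rm]) if d else (names[rm], names[reg])
        out.append(f"mov {dst}, {src}")
    return out

print(decode_mov("10001001110111101000100111111011"))
# -> ['mov si, bx', 'mov bx, di']
```

Checking the first 16 bits by hand: 10001001 is MOV with d=0, w=1, and 11011110 gives reg=011 (bx), rm=110 (si), so the reg field is the source: mov si, bx.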

1 year ago 1 0 0 0
before you give up, give this video a chance. (YouTube video by Founders, Inc.)

Ambition is a paradox.

You should always aim higher, but that easily becomes a state where you're never satisfied. Just reached 10k MRR? Now there's the next goal of 20k.

Sharif has a good talk on this: emotional runway.

How do you deal with this paradox?

video: www.youtube.com/watch?v=zUnQ...

1 year ago 1 0 0 0

There's some deep wisdom in that as well!

1 year ago 1 0 0 0
Post image

Qui-Gon Jinn sharing some insightful prompting wisdom 👌🏼

1 year ago 11 3 1 0

Exactly.

Suppose we have an algorithm that is guaranteed to give output according to a structure, with the caveat that it might run out of tokens.

Should this still be classified as structured generation?
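A toy sketch of the scenario (purely hypothetical, not any particular library's API): the generator below only ever emits tokens that are valid for a JSON array of integers, yet it can still exhaust its budget before closing the structure.

```python
# Toy constrained generator: every emitted token is valid for a JSON list
# of integers, but the token budget can run out mid-structure.
def constrained_generate(values, max_tokens):
    tokens = ["["]
    for i, v in enumerate(values):
        if i > 0:
            tokens.append(",")
        tokens.append(str(v))
        if len(tokens) >= max_tokens:
            # Out of budget: a structurally valid prefix, but not closed.
            return "".join(tokens)
    tokens.append("]")
    return "".join(tokens)

print(constrained_generate([1, 2, 3], max_tokens=100))  # "[1,2,3]"
print(constrained_generate([1, 2, 3], max_tokens=3))    # "[1,2" -- truncated
```

Every prefix it produces is reachable under the grammar, so in that sense the generation is always "structured"; whether a truncated, unclosed output still counts is exactly the classification question.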

1 year ago 1 0 1 0

🤔

1 year ago 1 0 0 0

CUDA libraries...? So they have access to GPUs as well? 👀

1 year ago 1 0 1 0

A video series on how to develop, profile, and compare CUDA kernels would be such a banger.

And allow a lot more tinkerers to enter the field.

1 year ago 0 0 0 0

Hell yeah 🔥

How would you classify the edge case when running out of tokens?

E.g. if it goes into a "\n" loop and runs out of tokens.

1 year ago 2 0 1 0

Hah, fair!

1 year ago 1 0 0 0

Interesting, for me it's snappy as hell, maybe things aren't cached as well in Costa Rica? 🤔

1 year ago 0 0 1 0

pro tip for the borrow checker: using .clone() everywhere is okay 🙌

1 year ago 0 0 1 0
Post image

it's this time of the year 😍

1 year ago 2 0 0 0

Or you can let the model run free in a constrained environment.

I'm tinkering on this: bsky.app/profile/erik...

1 year ago 2 0 0 0
Post image

Hugging Face Inference Endpoints now support CPU deployment for llama.cpp 🚀🚀

Why is this a huge deal? llama.cpp is well known for running very well on CPUs. If you're running small models like Llama 1B or embedding models, this will definitely save tons of money 💰💰

1 year ago 25 6 3 1

Nice! This is so neat 🙌🏽

1 year ago 1 0 0 0
Post image

Let's go! We are releasing SmolVLM, a smol 2B VLM built for on-device inference that outperforms all models at similar GPU RAM usage and token throughput.

SmolVLM can be fine-tuned in a Google Colab and run on a laptop! Or process millions of documents with a consumer GPU!

1 year ago 104 22 4 4
Post image

Is it just me, or does it feel intuitive that chat bars sit at the bottom of the page and search bars at the top?

I've noticed that Perplexity positions the question at the top and generates the text below.

Is it because they want to position more as a search engine?

1 year ago 1 0 0 0

The hope I have with Bluesky is that I, as a user, can do moderation more efficiently than I could on Twitter 🤞🏼

1 year ago 12 0 0 0

Feeds and starter packs helped at least me a lot. E.g: bsky.app/profile/did:...

1 year ago 1 0 1 0

Indeed, the beauty of open source 🔥

1 year ago 1 0 0 0

Can't wait to have that feature!

It's kinda mind-blowing that it's not a thing on other social media platforms 🤷🏼‍♂️

1 year ago 1 0 1 0

code boxes with syntax highlighting 😍

1 year ago 4 0 1 0