Posts by Marc-Antoine Ruel
I know "cryptography". I understood "TLS".
Oops I posted an old screenshot for ollama by accident, here's the one with gemma4 taken from github.com/maruel/genai...
genai generates automated scoreboard for tens of providers. There's a fair amount of variance!
github.com/maruel/genai...
Both qwen 3.5 2B and gemma4 e2b at Q4_K_M (!) are performing reasonably well for tool calling on llamacpp. It's impressive compared to a few months ago.
ollama is horrible for tool calling as you can't force a tool call, don't use it.
Mythos' cost is 25$/MTok in and 125$/MTok out
> Anthropicβs commitment of $100M in model usage credits to Project Glasswing (...)
That's like, 3 weeks of Mythos usage? :P
genai v0.4.1 is out!
While genai always enabled you to do HTTP replay tests for rock solid testing, you can now do replay tests with claude code, codex and opencode!
This enables seamless headless coding harness testing.
Oh and I added the github models to run LLMs from within github actions.
it's good. I need to start doing my taxes and a few other things
New record: blew up my 5h coding window in 7 minutes via 22 subagents on Claude Max 5x.
the solution is to create a new Google account. that's annoying for hobby projects
I feel you. It's always annoying when I make a mistake and the blast radius is largely increased because the algo thinks "engagement is good".
caic v0.6.0 now supports task forking. It's not the forking functionality on claude code, codex, or opencode. It takes a snapshot of the whole docker container and starts a fresh coding session. You can even map new repos in if you realize you do complex cross-repo work.
that's the way
@crawshaw.io figured out how to create engagement on twitter. :)
md v0.12.0 is out! You can now run `md fork` to fork a live running container to run a different harness, test different things.
It's like "Is [ex-colleague] in the top 5% of the people you worked with." referral question.
It's a tricky question to answer when "people I worked with" includes two programming language creators, people who created groundbreaking systems, and the likes.
I had a colleague thinking this during performance review at a large corporation and he would single handedly sink all his colleagues scores.
It's important to understand the context.
The director managing Jules and Gemini CLI went to OpenAI to work on codex.
I guess she understood we won't get a good coding Gemini for a while. I'm a bit sad about that.
from his website
Happy March 31st for investors, aka the day I finally get my tax forms from some organizations that wait up to the very last minute before sending them.
see their whole analysis at x.com/i/status/203...
it's interesting
TIL: Sam Altman celebrated Keith Rabois' mariage
availability bias is thinking the best engineers are active on social networks
An excellent read if you care about performance. The notes show they know what they are doing.
blog.ydb.tech/how-io-uring...
My contrarian take is that it's worth paying taxes to keep a functioning society.
I'm impressed how consistently partisan is Build Canada's newsletter, especially for a "non partisan NGO"
GMail's filter is such a missed UX opportunity. It's so powerful yet so not approachable.