Dio (@interleave.love) Bsky

Deplorable.

14 hours ago 1 0 0 0

I give it six months

19 hours ago 1 0 0 0

Yes. The setup does start to make sense if you want to run Kimi or GLM 5.1 though.

1 day ago 0 0 0 0

Technically but with barely any context window

1 day ago 0 0 1 0

Extra two*

1 day ago 0 0 0 0

I want you to consider cost in this discussion. It doesn't seem like you are. We are talking about spending cash on an extra three DGX Sparks when you could realistically run MiniMax on two. So for half the cost you get slightly worse tool calling and slightly worse output. Boo hoo.

1 day ago 0 0 2 0

Yes it is. But accuracy loss is less than 0.25% going from bf16 to q8. From q8 to NVFP4 it's about 1% loss.
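To make the arithmetic concrete, here is a rough sketch of how the two quoted figures stack up. It assumes both numbers are relative losses that compound multiplicatively, which the post does not actually say; the `retained` helper is made up for illustration.

```python
def retained(*relative_losses: float) -> float:
    """Fraction of baseline accuracy left after applying each relative loss in turn."""
    kept = 1.0
    for loss in relative_losses:
        kept *= 1.0 - loss
    return kept

# Figures quoted in the post; treating them as relative losses is an assumption.
bf16_to_q8 = 0.0025   # "less than 0.25%", used here as an upper bound
q8_to_nvfp4 = 0.01    # "about 1%"

total_loss = 1.0 - retained(bf16_to_q8, q8_to_nvfp4)
print(f"bf16 -> NVFP4 loss: roughly {total_loss:.2%}")
```

Under those assumptions the combined drop from bf16 to NVFP4 comes out a little under 1.25%.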

1 day ago 0 0 1 0

If you use an NVFP4 quant you should barely lose precision vs q8. You gain way more speed. As for parallel requests, that may legitimately be a benefit. Idk if it's worth it for 20 bands though.

1 day ago 0 0 2 0

Guys, this is why you should do some research. You can run CPT (continued pretraining) on a post-trained model pretty easily. And you don't need to save and then reapply the weights, although you can. You can sample instruction-tuned data during CPT. You can also do this with a LoRA, which is even cheaper.
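A minimal sketch of the "sample instruct data during CPT" idea, covering only the data-mixing step and not the training loop. The function name `mixed_stream`, the 10% mixing ratio, and the toy documents are all assumptions for illustration, not anything from a specific library or recipe.

```python
import random
from typing import Iterator, Sequence, Tuple

def mixed_stream(raw_docs: Sequence[str], instruct_docs: Sequence[str],
                 instruct_fraction: float, n: int,
                 seed: int = 0) -> Iterator[Tuple[str, str]]:
    """Yield n (source, text) pairs, drawing instruct examples with the given probability."""
    rng = random.Random(seed)
    for _ in range(n):
        if rng.random() < instruct_fraction:
            yield ("instruct", rng.choice(instruct_docs))
        else:
            yield ("raw", rng.choice(raw_docs))

# Toy data; a real run would stream a domain corpus and a chat dataset.
raw = ["domain text A", "domain text B"]
chat = ["User: hi\nAssistant: hello"]

batch = list(mixed_stream(raw, chat, instruct_fraction=0.1, n=1000))
share = sum(1 for source, _ in batch if source == "instruct") / len(batch)
print(f"instruct share: {share:.1%}")
```

The point of the mix is that the model keeps seeing chat-formatted examples while it absorbs the new corpus, so the post-training behavior is less likely to wash out. The right ratio depends on the model and data.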

1 day ago 0 0 0 0

Overkill for just running MiniMax. Tuning a quant could have cost him less than half that and given him faster speeds with barely any decrease in output quality.

1 day ago 0 0 1 0

1 day ago 0 0 0 0

I could have been early to builder Sunday. Instead, I decided to take the scenic route. Literally. I took some pictures along the way as I walked there.

1 day ago 0 0 1 0

Time to sleep and get ready for builder Sunday.

2 days ago 0 0 0 0

I still have to finish my email setup. I still have to get an Apple account for the App Store. I still have to do business registration. It's tough. I think I can do that this month but it is tough.

2 days ago 0 0 0 0

Interleave - The Relationship Coach in your Pocket The platform that helps you nurture what grows between you.

The road to mobile deployment for interleave.love is long. I said I would do this about two months ago and I still don't have an app on either store. Hoping to do that this month. But there are a ton of bugs to fix and a lot of testing to do for web, and then for mobile after.

2 days ago 0 0 1 0

Oof, Keating has a bug in the web UI that I am trying to resolve. Will see if I can get that fixed.

2 days ago 0 0 0 0

How to (Anti) AI Better YouTube video by Dr. Fatima

I think this is the start of improving #discourse on The Left® around LLMs. It's still a bit outdated given what's currently available, but it's a big improvement over the other takes.

2 days ago 0 0 0 0

My walk yesterday

2 days ago 0 0 0 0

Me too.

2 days ago 1 0 0 0

I've already used it to deploy and develop a bunch of stuff, including my own personal site, Keating, Axon, and den's own site.

2 days ago 0 0 0 0

I'm calling it den but if you guys have better ideas tell me. It's supposed to be the brainless way to allow your agents to deploy to production and work in development. It's reproducible and uses either Nix or Guix with Dhall to have declarative configuration for your project.

2 days ago 0 0 1 0

I've been cooking recently. Been playing with @fly.io sprites a ton. Wrapped it up in a tool I'm calling den, which lets you move between @railway.com and sprites. Preferably sprites for development and Railway for final deployment. Thinking of adding Cloudflare as well. Not sure yet.

2 days ago 0 0 0 1

It can't if they are charging for it.

2 days ago 1 0 1 0

Why

2 days ago 0 0 0 0

No commercial license! 😔

2 days ago 2 0 1 0

I was so excited for the release. This is beyond frustrating to me.

2 days ago 1 0 0 0

You see, I tend to prefer using the library for a thing. The library is cool, but I kind of want to see how I could combine things and use it to make the stuff I came up with after thinking about what it all does.

3 days ago 1 0 1 0

Just to be clear, this does come with a trade-off per bit. It is less accurate than TurboQuant: about one bit less accurate overall. But you get huge speedups in comparison.

3 days ago 1 0 0 0

RotorQuant — Clifford Algebra Vector Quantization | Scrya A reimagining of Google's TurboQuant using Clifford rotors. 10-19x faster, 44x fewer parameters, matching attention fidelity on real LLMs.

So TurboQuant gets thrashed already. The speedups are getting kind of crazy, guys. This makes using Gemma 4 31B sensible even on an M1 Mac with 48 GB of RAM at Q4.
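For a back-of-the-envelope check on the 48 GB claim, here is a small sketch of weight memory at different bit widths. It counts weights only; real Q4 formats store scales alongside the weights, and the KV cache and runtime take extra room, so treat the result as a floor. The helper name is made up for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9

# 31B parameters is the model size named in the post.
q4 = weight_memory_gb(31, 4)
bf16 = weight_memory_gb(31, 16)
print(f"Q4: {q4:.1f} GB, bf16: {bf16:.1f} GB")
```

That gives roughly 15.5 GB at 4 bits per weight versus 62 GB at bf16, which is why Q4 leaves headroom on a 48 GB machine while bf16 would not fit at all.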

3 days ago 2 0 2 0

There is now an autoteacher. It's called Keating, after the movie Dead Poets Society. keating.help

3 days ago 0 0 0 0

Posts by Dio