Deplorable.
Posts by Dio
I give it six months
Yes. The setup does start to make sense if you want to run Kimi or GLM 5.1 though.
Technically, but with barely any context window.
Extra two*
I want you to consider cost in this discussion. It doesn't seem like you are. We are talking about spending cash on an extra three DGX Sparks when you could realistically run MiniMax on two. So for half the cost you slightly worsen tool calling and get slightly worse output. Boo hoo.
Yes it is. But accuracy loss is less than 0.25% going from BF16 to Q8. From Q8 to NVFP4 it's about 1% loss.
If you use an NVFP4 quant you should barely lose precision vs Q8. You gain way more speed. As for parallel requests, that may legitimately be a benefit. Idk if it's worth it for 20 bands though.
Guys, this is why you should do some research. You can just run CPT (continued pretraining) on a post-trained model pretty easily. And you don't need to save and then reapply the weights, although you can. You can sample instruction-tuned data during CPT. You can also do this with a LoRA, which is even cheaper.
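A rough sketch of what "sample instruction-tuned data during CPT" can look like on the data side. This is only an illustration, not anyone's actual pipeline: the `mixed_stream` name, the 20% ratio, and the toy documents are all assumptions made up for the example, and the training loop itself (full fine-tune or LoRA) is left out.

```python
import random
from itertools import islice
from typing import Iterator, Sequence, Tuple


def mixed_stream(
    raw_docs: Sequence[str],
    instruct_docs: Sequence[str],
    instruct_ratio: float = 0.2,  # hypothetical mix; tune for your own run
    seed: int = 0,
) -> Iterator[Tuple[str, str]]:
    """Yield (source, text) pairs forever, mixing instruct data into the CPT corpus.

    With probability `instruct_ratio` the next sample is drawn from the
    instruction-tuned set, otherwise from the raw pretraining corpus.
    """
    rng = random.Random(seed)
    while True:
        if rng.random() < instruct_ratio:
            yield "instruct", rng.choice(instruct_docs)
        else:
            yield "raw", rng.choice(raw_docs)


# Toy data standing in for real corpora (assumption for the example).
raw = ["domain document A", "domain document B", "domain document C"]
instruct = ["User: hi\nAssistant: hello", "User: 2+2?\nAssistant: 4"]

batch = list(islice(mixed_stream(raw, instruct), 8))
for source, text in batch:
    print(source, "|", text.splitlines()[0])
```

The idea is just that the model keeps seeing some chat-formatted samples while it absorbs the new corpus, which is one common way to limit how much instruction following degrades.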
Overkill for just running MiniMax. Tuning a quant could have cost him less than half that and given him faster speeds with barely any decrease in output quality.
I could have been early to builder Sunday. Instead, I decided to take the scenic route. Literally. I decided to take some pictures along the way as I walked there.
Time to sleep and get ready for builder Sunday.
I still have to finish my email setup. I still have to get an Apple account for the App Store. I still have to do business registration. It's tough. I think I can do that this month, but it is tough.
The road to mobile deployment for interleave.love is long. I said I would do this about two months ago and I still don't have an app on either store. Hoping to do that this month. But there's a ton of bugs to fix and testing to do for web and then for mobile after.
Oof, Keating has a bug with the web UI that I am trying to resolve. Will see if I can get that fixed.
I think this is the start of improving #discourse on The Left® around LLMs. Still, it's a bit outdated given what's currently available, but it's improved a lot over the other takes.
My walk yesterday
Me too.
I've already used it to develop and deploy a bunch of stuff, including my own personal site, Keating, Axon and den's site itself.
I'm calling it den but if you guys have better ideas tell me. It's supposed to be the brainless way to allow your agents to deploy to production and work in development. It's reproducible and uses either Nix or Guix with Dhall to have declarative configuration for your project.
I've been cooking recently. Been playing with @fly.io sprites a ton. Wrapped it up in a tool I'm calling den, which allows you to move between @railway.com and sprites. Preferably sprites for development and Railway for final deployment. Thinking of adding Cloudflare as well. Not sure yet.
It can't if they are charging for it.
Why
No commercial license! 😔
I was so excited for the release. This is beyond frustrating to me.
You see, I tend to prefer using the library for a thing. The library is cool, but I kind of want to see how I could combine things and use it to make stuff I come up with after thinking about what it all does.
Just to be clear, this does come with a trade-off per bit. It is less accurate than TurboQuant, about one bit less accurate overall. But you get huge speed-ups in comparison.
So TurboQuant gets thrashed already. The speed-ups are getting kind of crazy, guys. This makes using Gemma 4 31B sensible even for an M1 Mac with 48 GB of RAM at Q4.
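For anyone wondering why Q4 on 48 GB is plausible, here is the back-of-the-envelope math as a tiny script. It is only an estimate of the weights: it assumes exactly 4, 8, or 16 bits per parameter (real Q4 formats usually carry a bit of extra overhead for scales), and it ignores the KV cache and runtime overhead. The function name is made up for the example.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    # params (in billions) * bits per weight / 8 bits per byte = gigabytes
    return params_billions * bits_per_weight / 8


# 31B parameters at a few common precisions (idealized bit widths).
for label, bits in [("bf16", 16), ("q8", 8), ("q4", 4)]:
    print(f"{label}: {weight_memory_gb(31, bits):.1f} GB")
# prints:
# bf16: 62.0 GB
# q8: 31.0 GB
# q4: 15.5 GB
```

So at Q4 the weights take roughly a third of 48 GB, which leaves room for context, while bf16 would not fit at all.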
There is now an autoteacher. It's called Keating, after the teacher in the movie Dead Poets Society. keating.help