
Posts by Pyra

Lol

1 day ago 1 0 0 0

If this is your first time hearing about bedrocklinux.org, then you're in for an adventure with this one

3 days ago 60 1 6 0
Video

Round Earth v2: with some tweaking the effect becomes rather nice; I love how distant mountains & clouds disappear into the horizon. However, zooming around this map makes me really dizzy somehow 😅

#indiedev #gamedev #indiegames #screenshotsaturday
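If you want to try the same trick: the sketch below is the usual "curved world" approach, not my exact shader. Each vertex sinks by the circular drop d²/(2R) with horizontal distance from the camera; the radius is just a tuning knob, and the function name is mine.

```python
# Toy sketch of the standard curved-world trick: sink each vertex by the
# circular drop d^2 / (2R), where d is horizontal distance from the camera.
# PLANET_RADIUS is a made-up tuning value, not a real-world constant.
import math

PLANET_RADIUS = 2000.0  # smaller radius = more visible curvature


def curve_vertex(x: float, y: float, z: float,
                 cam_x: float, cam_z: float) -> tuple[float, float, float]:
    d = math.hypot(x - cam_x, z - cam_z)  # horizontal distance to camera
    drop = d * d / (2.0 * PLANET_RADIUS)  # how far the "earth" falls away
    return x, y - drop, z                 # distant geometry sinks below the horizon
```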

3 days ago 50 7 1 0

I will try it, but I think as long as it's not generating less than ~1 t/s, the main bottleneck would be prompt processing for big system prompts and cold starts for existing conversations.

I'm currently working on a patch that saves the KV cache to disk, and I think that would help.
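For reference, llama-server can already do something in this direction if you start it with --slot-save-path: you can snapshot a slot's KV cache to disk and restore it later over HTTP. A rough sketch (port, slot id, and filename are placeholders, and this is the existing slot API, not my patch):

```python
# Sketch: persisting a conversation's KV cache via llama-server's slot
# save/restore endpoints. Assumes the server was started with
# `llama-server --slot-save-path /tmp/slots ...` on localhost:8080.
import requests

BASE = "http://localhost:8080"

# Save slot 0's KV cache to a file under --slot-save-path
r = requests.post(f"{BASE}/slots/0", params={"action": "save"},
                  json={"filename": "conversation.bin"})
r.raise_for_status()

# ... later (e.g. after a cold start), warm the slot back up from disk
r = requests.post(f"{BASE}/slots/0", params={"action": "restore"},
                  json={"filename": "conversation.bin"})
r.raise_for_status()
```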

1 week ago 1 0 1 0

I mean, it works well with small context, right?

1 week ago 1 0 1 0

Have you tried with TurboQuant enabled?

1 week ago 1 0 2 0

This is what happens when you implement correct tests and then ask an LLM to fix the bug they were testing, and it decides to simplify the tests instead

2 weeks ago 1 0 0 0

Look at this little Qwen box :3

3 weeks ago 1 0 0 0
Post image
3 weeks ago 2 0 2 0

Also, as for memory management: I have set VRAM to 4 GB of global memory, so llama.cpp falls back to offloading the KV cache and weights to RAM while still using the GPU for compute. As far as I tested, there are no performance implications, since RAM and VRAM are physically the same memory.

3 weeks ago 0 0 0 0

This setup hot-swaps small models instantly and takes around 35 s for bigger ones.

I still need to finalize it, especially which models I run. But I'm very happy so far.

3 weeks ago 0 0 0 0

So, my setup for Strix Halo so far:

1. llama.cpp with the --no-mmap flag (otherwise big models take forever to load)

2. llama-swap with the following config (sketched after this list):

It can either run qwen122b q4 (coding) or hermes4 70b (for RP and general conversations)

OR

It runs one or more of: qwen9b, lfm24b.
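Roughly, the llama-swap config for this looks like the sketch below; model names, file paths, and ports are illustrative placeholders, not my exact file:

```yaml
# Illustrative llama-swap config; paths and model names are placeholders.
models:
  qwen-coder:
    cmd: llama-server --port ${PORT} --no-mmap -m /models/qwen-coder-q4.gguf
  hermes4-70b:
    cmd: llama-server --port ${PORT} --no-mmap -m /models/hermes4-70b-q4.gguf
  qwen9b:
    cmd: llama-server --port ${PORT} --no-mmap -m /models/qwen9b.gguf
  lfm24b:
    cmd: llama-server --port ${PORT} --no-mmap -m /models/lfm24b.gguf

groups:
  # the big models swap with each other and evict everything else
  big:
    swap: true
    exclusive: true
    members: ["qwen-coder", "hermes4-70b"]
  # the small models may run side by side, but still evict the big group
  small:
    swap: false
    exclusive: true
    members: ["qwen9b", "lfm24b"]
```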

3 weeks ago 0 0 2 0

I'm gonna try UD-Q5_K_XL :3

4 weeks ago 1 0 1 0

I think hermes4-70b is very nice for conversations and reasoning though

4 weeks ago 1 0 1 0

Hbu @astrra.space ? ^^

4 weeks ago 1 0 1 0

Alr alr, fair.

Any thoughts on nemotron 3 super?

4 weeks ago 1 0 1 0

Also, regarding the inference engine, I'm very curious about mistral.rs. Just... no ROCm support :(

So llama.cpp it is, for now

4 weeks ago 1 0 1 0

I'm trying Qwen3 coder next, but the one you suggested sounds good. Going to pull this rn

4 weeks ago 1 0 1 0

@astrra.space hey. So what is the best model on fw desktop you ended up with?

4 weeks ago 1 1 2 0
Video

Archipelago
#ProcGen

1 month ago 303 42 10 2
Post image
1 month ago 0 0 0 0

🙀

1 month ago 1 0 0 0

I mean, this is generally my first reflex when dealing with an LLM. But then I remember I should work

1 month ago 2 0 0 0

I just ctrl+d

1 month ago 1 0 2 0
Video

Soft clouds

1 month ago 18 2 2 0
Post image
1 month ago 602 168 6 0

I spent 2h figuring out what caused NixOS to default ~ to /. I gave up and hardcoded $HOME in my config as an env var. :/
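For the record, the workaround is just something like this in configuration.nix. A blunt sketch, not a proper fix; the username and path are placeholders:

```nix
# Blunt workaround sketch: pin $HOME for login sessions in configuration.nix.
# "pyra" and the path are placeholders.
environment.sessionVariables = {
  HOME = "/home/pyra";
};
```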

3 months ago 1 0 0 0

My day started with my laptop suddenly turning off and becoming unusable after startup

3 months ago 1 0 1 0

I would say it is rather the right size :)

3 months ago 0 0 0 0

There is no way I'm not buying it now. I got my fw12 and so far it's been absolutely perfect. :)

4 months ago 1 0 0 0