A. H. Zakai (@kripken.com) Bsky

More seriously, though: yes, ML models learn their datasets, but how they generalize beyond them is uncertain.

I would be very surprised if different model architectures all end up generalizing in the exact same way!

(that is not what we see today, but the claim is in the limit, so who knows)

9 hours ago 2 0 0 0

The dataset, unfortunately, contains descriptions of machines bringing about the end of the world

9 hours ago 2 0 1 0

Me when I’m on the phone to tech support because the router’s gone down

15 hours ago 10 1 0 0

Maybe another way to look at it is that they didn't set out to build a hacking tool.

It's not a total coincidence they ended up with one - getting better at software means you might end up good at many things, including that - but an orbital laser, well, that's presumably the entire plan.

1 day ago 1 0 0 0

wingolog: wastrel milestone: full hoot support, with generational gc as a treat

new bloggery: it's alive, it's alive! ~~ wingolog.org/archives/202...

1 day ago 15 6 0 0

Humble Tech Book Bundle: Linux, the Good Stuff by No Starch. Unlock new levels of freedom and creativity when you use Linux. Master the ins and outs of Linux today and help support charity!

Building a Debugger is part of the Humble Books Bundle for the next couple weeks!

www.humblebundle.com/books/linux-...

2 days ago 40 16 0 1

Unfortunately it is probably in the range of "powerful enough to find browser exploits and cause havoc" but not "powerful enough to cure cancer"

I guess the really unfortunate thing is that the latter is much, much harder

3 days ago 2 0 1 0

A seagull blending in with a bunch of penguins at feeding time at a zoo.

Be cool, Gary, be cool, you are just another penguin in the line for a fish...

4 days ago 5 1 1 1

I would add that, even if the core LLM has a chance of making an error, that does not mean a system built on LLMs can't be reliable.

Combinations of imperfect things can end up even worse, like in the telephone game. But the opposite is also possible, like in a random survey.
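
To make the contrast concrete, here is a minimal Python sketch under simplifying assumptions that are mine, not part of the original point: answers are binary, and every component errs independently with the same probability. The function names are hypothetical. A telephone-style chain passes one bit through n noisy relays; a survey-style system takes a majority vote over n noisy answers.

```python
from math import comb


def chain_accuracy(p_err: float, n: int) -> float:
    """Chance a binary message is still correct after n relays,
    each of which flips it independently with probability p_err."""
    # The message is correct iff an even number of flips occurred.
    return (1 + (1 - 2 * p_err) ** n) / 2


def majority_accuracy(p_err: float, n: int) -> float:
    """Chance that a majority of n independent answers is correct (n odd)."""
    if n % 2 == 0:
        raise ValueError("use an odd n so there are no ties")
    p_ok = 1 - p_err
    # Sum the binomial probabilities of getting more than half right.
    return sum(
        comb(n, k) * p_ok**k * p_err ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    for n in (1, 5, 11):
        print(n, round(chain_accuracy(0.1, n), 4), round(majority_accuracy(0.1, n), 4))
```

With a 10% error rate per component, the chain drifts toward a coin flip as n grows, while the majority vote climbs toward certainty. Independence is the load-bearing assumption here: correlated errors would weaken the voting result.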

6 days ago 0 1 0 0

About human skeletal structure vs cognition: I think people reach for a stream of changing metaphors to explain cognition, as you said (very good point, btw!) only because it is a mystery.

Skeletal structure, otoh, doesn't strike them as a mystery.

6 days ago 2 0 1 0

(But if I had to guess why people reach for these metaphors: if you accept there is no soul - only matter - then an intuitive guess at how the brain works is "the smartest machine we have invented as of today")

6 days ago 1 0 1 0

I'm not sure why people think that, but the antidote is to question "pattern matching". Many different ideas are clumped there.

1. "learns from examples" - so broad it includes most intelligent behavior, even human

2. "matches superficial statistical patterns" - so restricted it excludes LLMs

6 days ago 2 0 1 0

Possibly different teams, yeah. Behavioral testing and investigating neural circuits are pretty different.

1 week ago 2 0 0 0

Not to defend all their work (which I'm not familiar with), but the Induction Head papers seem solid to me, and I do think we actually learned a lot there about how these systems work at a low level.

So at least some of their work is solid, imo

1 week ago 1 0 1 0

What are your thoughts on Anthropic's Induction Heads work?

1 week ago 1 0 1 0

Do you have a link to it?

1 week ago 0 0 0 0

I haven't seen the video, and I don't have an X account to look for it. Is there some chance it is a joke?

1 week ago 0 0 1 0

As evidence the site is satire, it becomes increasingly silly, ending with "success reports" by

* "Former CTO, Definitely Real Corp"
* "Chad Stockholder, Profit First LLC"
* "Patricia Bottomline"
* "Dr. Heinrich Offshore"

The word "Malus" literally means "bad" or "evil" in Latin

1 week ago 3 2 2 0

Honest question: is there evidence that would change your mind on this, and if so, what is an example of it?

1 week ago 3 0 0 0

If software is cheaper and faster to produce, maybe he'll finally get that software and become a few % more productive.

And maybe many such small opportunities exist all around?

1 week ago 1 0 0 0

As a random example, someone I know who works in film production told me that, if he had software that could organize his clients *like so*, it would save him hours of work.

The software he wants is nothing fancy, just very tailored. But for his small company, not worth hiring a software firm.

1 week ago 2 0 2 0

Revolutionizing software development could revolutionize many other fields, since so many things can benefit from it.

I'm not saying that's my prediction, but at this point the benefits to software are significant enough that they *could* end up revolutionary, for better or worse.

1 week ago 1 0 1 0

Can randomization distinguish the two?

"Find which medical images show signs of cancer:", and 50% are positive and 50% negative, in random order.

Seems like the machine has to know the material here to succeed, or am I missing something?
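
One way to put a number on that intuition, as a rough sketch: if the set is balanced and shuffled, a machine that knows nothing can do no better than a fair coin per image, so the chance of a high score by luck is a binomial tail. The function name and the 100-image example are my own illustrative assumptions.

```python
from math import comb


def chance_of_guessing_at_least(k: int, n: int) -> float:
    """Probability of getting at least k of n balanced yes/no items right
    by fair coin-flip guessing alone."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    # Count the ways to get k, k+1, ..., n right out of 2**n equally likely outcomes.
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


if __name__ == "__main__":
    # How likely is it to score 90 or more out of 100 by pure guessing?
    print(chance_of_guessing_at_least(90, 100))
```

Under these assumptions a score far above 50% is very hard to reach by luck, which is the sense in which randomization separates knowing the material from exploiting the test's layout. It does not rule out other shortcuts, such as artifacts in the images themselves.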

1 week ago 0 0 0 0

We are back! With 3am money panics!

if you should care to funnel funds here are the usual methods...

Acquire More Stuff: octophant.us/buy

Just Cash: ko-fi.com/phineasx

And remember I am open to commissions, paying work etc.

2 weeks ago 10 3 0 0

Firefox bug numbers currently look like this:

"Bug 2025603", "Bug 2025604"

Confused me for a moment given it is 2026...

2 weeks ago 3 0 0 0

Rewriting our Rust WASM Parser in TypeScript | OpenUI: We rewrote our Rust WASM Parser in TypeScript - and it got 3x Faster

Strongly endorse this list of rules for when WebAssembly makes sense vs when JavaScript does

www.openui.com/blog/rust-wa...

3 weeks ago 14 5 0 0

link: run fpcast-emu before optimization passes by thiblahute · Pull Request #26478 · emscripten-core/emscripten: Move the --fpcast-emu binaryen pass earlier in the pass pipeline so it runs before -O2. Previously, directize (part of -O2) would see type-mismatched call_indirect entries and replace them with unr...

Nice when a correctness fix makes things faster:

github.com/emscripten-c...

Unsafe pointer casts usually trap in wasm, but as an option they can be "emulated" to work like in native builds. This PR fixes a bug *and* makes things 2x speedier 🚀

3 weeks ago 6 0 0 0

(Current models fail there, certainly - I'm not claiming otherwise.)

3 weeks ago 0 0 0 0

Thanks, sorry for missing that!

But about pure LLMs, is this a limitation in principle? We know neural networks build internal representations for games like Othello and even simple programming languages. Are you certain they cannot do the same for enough math to compute pi?

3 weeks ago 0 0 1 0

When you ask an LLM to count the number of 'r's in strawberry today, it will call out to a Python program to do it. It could do the same with math calculations like you suggest, in principle?

Yes, that would not be a pure LLM - if that is your point then I agree, modern systems are hybrids.
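
For illustration, the kind of helper such a hybrid system might hand the question to is tiny. This is only a sketch: `count_letter` is a hypothetical tool name, not any vendor's actual API, and how a real system routes the call is not shown.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    if len(letter) != 1:
        raise ValueError("letter must be a single character")
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    print(count_letter("strawberry", "r"))  # 3
```

The model's job in that setup is only to recognize the question and pass along the arguments; the counting itself is done by ordinary, deterministic code.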

3 weeks ago 0 0 1 0
