David Crawshaw (@crawshaw.io) Bsky

Yes my prediction is entirely testable and I would be very happy to be wrong. My bias is the obvious one: I want to use the best model.

1 day ago 5 0 0 0

Very true. A better way I could have made my point would be: the right time to panic about LLM security was a year ago.

1 day ago 3 0 1 0

On distillation: I am not aware of a way to stop it other than playing whack-a-mole with accounts. Though if there is a good defense against it I should not be informed.

1 day ago 12 0 1 0

But given that fact it is not obvious to me that restricting the tools is best, other than as a start of a restricted-release business model. Well-funded attackers have Opus giving them the bugs to write exploits. Who gets more value today out of Mythos, attack or defense? I would claim defense.

1 day ago 12 0 2 0

I did read the report, what I took away from it is Mythos is very good at generating exploits for bugs that mostly Opus could find. (If you think I am misreading it I would like to hear, I often find the numbers in these styles of reports hard to interpret.)

1 day ago 13 0 3 0

The idea that big tech companies will inherently have better security tools than me because of enterprise access deals makes me sad.

1 day ago 11 2 0 0

1-2 in code I didn’t write depending on how you count, but in private code bases. The real shock was a week ago when I tried to make a change to how exe.dev keys work. I asked Opus for a security review and it found something I hadn’t thought of which led to a chain reaction of 5 more horrible bugs.

1 day ago 10 0 2 0

In code as I write it? Perhaps a dozen.

1 day ago 7 0 1 0

By the time you and I can use Mythos, there will be a new top-end rev that is enterprise only. That treadmill helps keep the enterprise dollars flowing (which is most of the dollars) by relegating distillation companies to second rank.

1 day ago 57 2 1 0

Prediction: Glasswing is mostly misdirection. I am sure it is better than Opus and thus finds more security bugs. But Opus was the change in kind.

This is marketing cover for the fact that top-end models are now gated by enterprise agreements and no longer available to small labs to distill.

1 day ago 99 12 10 4

exe.dev region selector

new exe.dev feature

1 day ago 26 3 0 0

I'm not particularly happy about being wrong. But there is also only so much effort I can put into each of these little messages, so I am going to be wrong on occasion. I'll do my best to minimize it.

4 days ago 2 1 1 0

Turns out I was wrong on the history here and both people involved in markdown had a CS background. I suppose this is evidence that even those versed in expertise can put it aside when reality comes knocking.

5 days ago 14 0 1 0

The beauty of markdown is that it is useful. But it hides a secondary lesson: experts are not always the best people to solve problems. Anyone with a strong CS background would never have designed markdown, because it cannot be parsed cleanly as a context-free grammar.

5 days ago 27 1 4 0

This approach to UI design sounds like a nice theory that would not work in practice. Turns out it does work, and once you commit, it has a clarifying effect across your entire API. blog.exe.dev/a-simple-ui-...

1 week ago 24 1 0 0

I’m generally curious about this, mostly from the domain auth side because I’d like people to be able to bring their own domain so their VMs can be vm7.you.dev, etc. What part of atproto would you like to see?

1 week ago 1 0 1 0

OH: localstorage is this generation’s longjmp

1 week ago 4 1 0 0

Everyone is building a software factory - exe.dev blog Is exe.dev the compute primitive you've been waiting for?

I want to hear about all of your personal software factories. What are you doing that's working? We went around the table at @exe.dev on Tuesday and all of our workflows are so different. blog.exe.dev/bones-of-the...

2 weeks ago 14 2 2 0

“You can install a GitHub App without authorizing the app. Similarly, you can authorize the app without installing the app.” In completely unrelated news, you can now connect GitHub repos granularly to your @exe.dev VMs, without generating (and exposing to LLMs) personal access tokens.

2 weeks ago 13 1 0 0

To: me
Subject: something or other
Body: No action required. Blah blah blah.

This didn't even need to be an email.

2 weeks ago 5 0 1 0

New feature: the pi coding agent is included by default. It is hooked up to the LLM gateway included in your subscription.

2 weeks ago 23 0 3 0

can I fix it? no. an upstream network is to blame and I do not have a way to route around them. so I turn one of the stacks off, because first your network has to work.

2 weeks ago 1 0 0 0

hours of testing: yes that site has poor behavior on one of its network stacks. not packet loss, but high variance in packet rtt, making interactive sessions painful.

2 weeks ago 1 1 1 0

"networking is too easy, let's run two network stacks everywhere, with different performance characteristics"
- dual stack enthusiasts

2 weeks ago 21 0 1 0

I know you don't want to self-host, but I can think of somewhere this would be fun to try and run. 🤔

2 weeks ago 4 0 0 0

Just moved all exe.dev VMs behind a new global load balancer and anycast network. All our qualification testing shows much better routing locality. If you have a VM, please give it a go and let me know how it goes!

3 weeks ago 15 0 0 0

This is not progress.

3 weeks ago 1 0 0 0

Whoever decided to not include ifconfig and netstat in the default Ubuntu server images: why?

3 weeks ago 10 0 3 0

we can produce SOC2 type 1 for business customers.

we have implemented Google OIDC and Okta support in the teams plan (actually Google OIDC works for anyone logging in with a gmail address) and teams has an admin/user role distinction. more features planned!

3 weeks ago 0 0 1 0

You mean you want exe.dev launched in a region on a cloud provider? We are investigating this for business customers. If you are thinking of this in a business context, I would love to chat!

3 weeks ago 0 0 1 0

Posts by David Crawshaw