Autoresearch with docs! You’d have to have a set of goals and programming tasks people want to achieve with the lib to help generate the baseline for the eval. But I think it would work.
Posts by iamwil
Another post in the same spirit, pointing at something similar. You have to look at the raw data in order to train your own brain, it appears.
ergosphere.blog/posts/the-ma...
soccer field at cursor camp
Also, I think Evan probably didn't have the social tools or clout, with both carrots and sticks, to keep community members in check and stop them from airing grievances to a wider audience. It can wear great people out.
Evan had other issues, like how to get funding. He's said in interviews he was naive about how it worked. However, I think it had such good taste because he had sole control over it. The downside is that taste and simplicity take time, and not everyone has taste or patience.
Or it’ll drive people to run local open source models where it makes sense. It can be damaging to model labs if people can’t trust that they’re getting the best decision rather than being sold something.
I read it as a warning about writing software industrially. It's inevitable, but as people wrestle with the reviewer's bottleneck, there's not an awareness of the flip side of this friction. The core mantra of any ML data scientist is, "Look at the data!"
artificialbureaucracy.substack.com/p/kill-chain
And while we're on vibes, Palantir is definitely House Slytherin.
The impractical play is fun. Why stop here? Let's add game juice to the terminal, and add screen shake and explosions every time you type.
youtube.com/clip/Ugkx2Y...
x.com/_jason_toda...
AI governance: anxious sophistication
Defense tech: mission-serious with secret fun
Biotech/pharma R&D: academic precision meets startup energy
Professional services firms handling sensitive data: expensive restraint
It ended up being a really interesting question to ask Opus 4.6 how it would characterize the vibes of different boring sectors.
Compliance consultants for physical industries: earnest pragmatism.
Security compliance for SaaS: hacker professionalism
What's the class?
What is this? Are you working on a spreadsheet whose formulas are APL?
Yeeeeesss
A big enough quantitative change results in a qualitative change!
They have the preferences of their humans, as driven by their memory and their system prompts in AGENTS.md and CLAUDE.md. And these can be fad- and fashion-driven. I mean today it’s stack and compound engineering. Tomorrow, who knows.
As agents make more of the on-the-ground decisions, it seems likely there will be AX (agent experience) designers and "AI psychologists" that specialize in the cognitive biases of LLMs, both to entice them to pick your product and to defend them against making bad choices.
This was genuinely surprising to me. It's a basis for describing the space of (some aspect of) different logic systems. But the *enforcement* of logic in the type system is a different beast altogether.
One of many things I have to put down gingerly, so I don't get nerdsniped.
Yesterday, I got inspired to just ask Claude Code to root the 1st gen Kindle Fire for me. And it did, quite easily. So now, I can extend its usefulness as a remote terminal for continuing a Claude Code session, with a bigger screen than my phone.
It felt exhilarating.
This is something none of us would have attempted before AI, even if we could have dreamed it up. But now, we can.
For me personally, I've had a 1st gen Kindle Fire I'd been trying to root since 2020. Never could find all the packages and forum posts with updated instructions.
The most interesting part of this is actually how he leveraged AI to do the tedious manual work of verifying the TypeScript code against what the browser would report.
Coding agents should make us all more ambitious, and this is a good example.
x.com/_chenglou/s...
It's rather dumb that GitHub doesn't allow crawling from agents due to its robots.txt, because I then just asked the agent to clone the repo and read it. Are they that worried about serving requests? Maybe they're just getting hammered with all the agent coding PRs.
Software devs that deride @garrytan's gstack as "just markdown" are missing the idea that perhaps the software that needs updating is the one that's in your head. If advice delivered at scale can stop you from your own self-defeating tendencies, then maybe .md is all we need.
I was floored when I found out that Slack was a backronym for "Searchable Log of All Communication and Knowledge". It's a terrible search for knowledge. Every decision ends up in its backlogs to die. And the middle of every meeting is "why did we do this again?"
Today I Noticed: any 2FA app should list the codes as an LRU cache.
“Everything is amazing and nobody is happy”
youtu.be/PdFB7q89_3U
I, too, like a nice clean bed.
You have to be ok with making it wait, and not thinking that you’re wasting time. That’s if you value your judgement. If not, then you should be working on building the factory.
When we get light AR glasses (or neural implants), maybe DLSS-neural-like rendering can make our interior decoration look better than it actually is. It's a bit dystopian. Live in a brutalist apartment, render it as a fluffy unicorn cloud room.
x.com/Mishok2000/...
For the first time in my career, the pace of change in a sector of technology (AI), both at the model level and at the agentic engineering level, is faster than I can keep up with while still doing my work. So I just have to be ok with letting it all sort out on its own.