I wonder how many OpenClaw agents are running for users that have already passed away
Posts by Ari
I think there's truly very little free lunch to be had in how much you memorize vs. generalize—it just depends on the environment, which will naturally select what magnitude of adaptability and what level of abstraction you should memorize at
every species has a minimum viable population—below some threshold it just can't sustain itself. what's the MVP for an LLM ecosystem where models write data and future models train on it? is there one, or does it always collapse? and what should count as a distinct individual?
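The drift intuition here is easy to simulate. A toy sketch, with a unigram model standing in for an LLM and every number invented for illustration; each generation trains only on the previous generation's output:

```python
import random
from collections import Counter

def generation(corpus, n_samples, rng):
    """One cycle of 'models train on model-written data': fit a unigram
    model to the corpus, then sample a new corpus from it."""
    counts = Counter(corpus)
    tokens = list(counts)
    weights = [counts[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=n_samples)

rng = random.Random(0)
# 26 'species' of token, evenly represented at the start
corpus = [chr(ord("a") + i) for i in range(26)] * 40
sizes = []
for _ in range(200):
    # each generation sees only the previous generation's output
    corpus = generation(corpus, n_samples=60, rng=rng)
    sizes.append(len(set(corpus)))
# a token that drops out of the corpus can never be sampled again,
# so diversity only ever drifts downward
```

Sweeping `n_samples` upward until diversity stops collapsing within a fixed horizon would be one crude way to put a number on the minimum viable population.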
Maybe this is just another 'too many fingers' blip. Not sure.
I'm only gesturing at this, can't make it falsifiable yet. But I think there's an optimization difference: nature doesn't grade on a curve
I think publicly available LLMs do have a simulacrum tell: ask one to prove a hard theorem and it'll often hide an assumption in a step, wave its hands about why that's fine, and declare QED. It knows the script and will perform it: accurately if it can, inexactly if it can't.
I've always been skeptical of "simulations are inherently different from the real thing" arguments. Sure, literally different (no cloning theorem etc.), but the differences usually have nothing to do with the essential character of what's being simulated.
The one way I think current LLMs are noticeably simulacra: they don't appear to actively optimize for goals. They perform the kind of actions someone optimizing for a goal would make, and often that's enough to succeed. Maybe it's just a blip in LLM progress...but maybe not.
prediction: AI media will bring back true suspense. currently you can't feel real uncertainty b/c you find stories through channels that telegraph the outcome. an AI has nothing to lose; it'll kill your protagonist 90% of the way through if it makes you reckon with something.
I'm pumped.
What if we took tasks LLMs can do (plot summarization, bug finding) and progressively diluted them—more filler description, more boilerplate—to measure how much noise a model can tolerate before performance drops? Dilution tolerance seems like a capability worth benchmarking.
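A minimal sketch of what the dilution knob could look like; the `dilute` helper and the filler sentences are made up for illustration, and scoring the model at each ratio is left out:

```python
import random

# filler is invented here; a real harness might pull boilerplate
# from the same domain as the task so it isn't trivially separable
FILLER = [
    "The weather that day was unremarkable.",
    "Several unrelated meetings filled the afternoon.",
    "The office printer had recently been serviced.",
]

def dilute(signal, ratio, seed=0):
    """Insert filler at random positions until `ratio` of the output
    sentences are filler, keeping the signal sentences in order."""
    rng = random.Random(seed)
    n_filler = round(len(signal) * ratio / (1 - ratio))
    out = list(signal)
    for _ in range(n_filler):
        out.insert(rng.randrange(len(out) + 1), rng.choice(FILLER))
    return out

story = ["Ada found the bug.", "It was in the parser.", "She fixed it by noon."]
diluted = dilute(story, ratio=0.5)
```

Sweeping `ratio` from 0 toward 1 and recording where task accuracy falls off would give the tolerance curve per model.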
Adversarial prompts often hack an LLM's attention mechanism (e.g. Aridti et al. 2024). Can you make diluted adversarial prompts, or does dilution stop them from meaningfully disrupting processing because there are enough competing attractors?
if taste is truly the thing that matters in the era of GenAI, then it's not enough to be Rick Rubin. the human mind is too slow and too much of a bottleneck. one must be the Rick Rubin of Rick Rubins, able to recognize taste in others and then amplify it
it's the beginning of the beginning
LLMs are basically epicycles for text, except we don't know whether there's a heliocentrism to find
We should take relatively old LLMs, e.g., from 2023 or so, and see what data in 2026 they have trouble converging on via finetuning. Where are LLMs 'stuck' and where are they flexible?
machine unyearning
a more transparent (but much harder to sell) representation of LLMs would be less 'cautious assistant' and more 'genius toddler'
Something I find really annoying about most LLM discussions of research: they play it conservative in their wording but have zero epistemic humility when interpreting new data, swinging wildly between hypotheses in order to have something 'clean and easy' to say
h/t two papers that got me thinking about this:
Language Models use Lookbacks to Track Beliefs: arxiv.org/abs/2505.14685
Anchoring bias in LLMs: arxiv.org/abs/2412.06593
LLMs use lookbacks to track beliefs and also suffer from anchoring bias. Are these the same mechanism? Do models anchor b/c they look back at early info directly, or does the bias sometimes "infect" the residual stream at later positions, anchoring through an intermediary rather than direct attention?
I don't buy that "linear representations" in LLMs all form a composable linear space. I bet related directions (red, blue) compose naturally but unrelated ones don't. we should trawl through known directions and find which play nicely together to make a map of residual space
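One cheap first pass at that map: pairwise overlap between extracted directions. The vectors below are toy stand-ins for real residual-stream directions, and high overlap only flags pairs that can't be read out independently, a necessary rather than sufficient condition for failing to compose:

```python
import math
import random

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def interference_map(directions):
    """Pairwise |cos| between candidate feature directions."""
    names = list(directions)
    return {
        (a, b): abs(cosine(directions[a], directions[b]))
        for a in names for b in names if a < b
    }

# toy stand-ins for directions read out of the residual stream
rng = random.Random(0)
d_model = 256
def rand_dir():
    return [rng.gauss(0, 1) for _ in range(d_model)]

# give the two color directions a shared 'color' component,
# mimicking related features that live in a common subspace
color_axis = rand_dir()
dirs = {
    "red": [n + 2 * c for n, c in zip(rand_dir(), color_axis)],
    "blue": [n + 2 * c for n, c in zip(rand_dir(), color_axis)],
    "past_tense": rand_dir(),
}
sims = interference_map(dirs)
```

On real directions you'd then test the high-overlap pairs behaviorally: does steering with `red + past_tense` produce both effects, while `red + blue` produces interference?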
absolutely killer piece from @davidpreber.bsky.social
just had one of those days, where every time I started to get into something I realized I was late to something else
the question when planning an AI event is not whether to invite a Jason or even which Jason to invite, it's how many Jasons to invite
For instance, try freezing everything except component X, finetune across different Xs (early MLP, late MLP, attention O matrix, etc.), then compare the residual stream — e.g. PCA right before prediction. H/t to Zihao and Victor arxiv.org/html/2502.11...
Turns out tiny, arbitrary-seeming parameter subsets can learn tasks. Has anyone compared what the same task looks like when learned through different components? Can we map how LMs encode information by seeing what stays the same in task representations over different components?
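A sketch of just the freezing step from the experiment above; the parameter-name patterns assume a hypothetical GPT-2-style naming scheme, and the actual finetuning and PCA comparison are left out:

```python
def layer_of(name):
    # assumes GPT-2-style parameter names like 'h.<layer>.mlp.c_fc.weight'
    return int(name.split(".")[1])

def trainable_mask(param_names, component):
    """Decide which parameters stay trainable when isolating one
    component; everything else would be frozen before finetuning."""
    patterns = {
        "early_mlp": lambda n: ".mlp." in n and layer_of(n) < 4,
        "late_mlp": lambda n: ".mlp." in n and layer_of(n) >= 8,
        "attn_out": lambda n: ".attn.c_proj." in n,
    }
    keep = patterns[component]
    return {n: bool(keep(n)) for n in param_names}

names = [
    "wte.weight",
    "h.1.mlp.c_fc.weight",
    "h.9.mlp.c_fc.weight",
    "h.2.attn.c_proj.weight",
]
mask = trainable_mask(names, "early_mlp")
```

In a real run you'd set each parameter's `requires_grad` from this mask, finetune on the same task per component, then compare (e.g. by PCA) the final-position residual streams across components.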