This is just an attempt to weave all my related armchair thoughts into one piece; not a serious deeply-researched cite-able philosophical essay.
Happy to hear more thoughts & refs to expand my thinking!
This exercise made me realize how shocking it is that LLMs got so far with reasoning just by training on text that refers to more text, without ever grounding in non-text stimuli!
can develop "vision" from self-referential text, how we fail to visualize higher dimensional & quantum objects & yet manipulate them, and also various fascinating human phenomena (like not having an internal monologue), and some thought expts borrowed from consciousness.
I framed this as a discussion between three people who respond "yes/yes" vs. "no/no" vs. "yes/no" which leads to mind-bending questions/analogies, somehow simultaneously philosophical and concrete
e.g., the self-referential nature of the dictionary, how eigenvector representations of graphs
Consolidated my armchair thoughts about "how may an LLM (not) differ from a human who thinks in images/text?" I split this into two questions:
- is text sufficient to be correct about, say, a circle?
- does correctness imply sharing the same "understanding" as humans?
vaishnavh.github.io/blog/what-ll...
Scientists, just like lawyers, are also bound by codes of conduct that demand integrity in the process. So there's a tension between their allegiance to their idea and making sure they don't cross a line in advocating for it.
I have a few more arguments in the essay, and also more nuance: e.g., scientists still try to be as neutral as they can, but seemingly at a lower, day-to-day level, or at the initial stages of an idea.
For a position/idea to stand a *fair chance* against other positions/ideas in this courtroom, it *needs* a dedicated lawyer whose job is to think deeply and creatively about that position & present the best/strongest form of it for time to pass judgement.
In short, my view is that because science is exploration under uncertainty, at some level, scientists end up gambling and picking a side. These sides fight it out in a courtroom, with time as the judge.
In practice, scientists do not seem to behave like "neutral, rational agents" but rather behave like "zealous advocates" for an idea that has "hired them". I wrote about how I think this "courtroom" view of science works and what I learned from it!
vaishnavh.github.io/blog/emotion...
I remember that bad reviews meant you were banned from *reviewing* for a future conference, which sounds like a bad incentive system.... Why are you threatening someone with a good time?
Also, what's the catch with punishing bad reviews by preventing future submissions? Say: if your reviews are egregiously bad as flagged by multiple ACs across at least two conferences, you won't be able to submit papers to the next N conferences. (Possible that I'm missing something here.)
This would not only be more just, it would also disincentivize spamming, paper-count-maxing, chopping up one project into 5 papers, etc.
Curious why conferences don't have a system where the authors of every paper together guarantee N reviews per paper (and they can distribute the load amongst themselves). This way, wouldn't we tax authors in proportion to the number of papers they burden the system with?
A recent paper (arxiv.org/abs/2602.18671) made me question something basic: do the logits of a language model model the next-token distribution or the full-sequence distribution? It really messed with my brain (in a fun way!). I wrote about the paper to clarify my thinking.
vaishnavh.github.io/blog/joint-o...
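To make the question concrete for myself: in the standard view, the logits only directly parameterize next-token conditionals, and the sequence distribution is whatever the chain rule stitches together. A minimal sketch (the `next_token_logits` helper is a hypothetical stand-in for a trained LM, not any particular library's API):

```python
import numpy as np

# Minimal sketch: next-token logits, softmax-normalized at each step,
# implicitly define a full-sequence distribution via the chain rule.
# `next_token_logits(prefix)` is a hypothetical stand-in for a trained LM
# that returns a vector of logits over the vocabulary given a token prefix.
def sequence_log_prob(tokens, next_token_logits):
    total = 0.0
    for t, tok in enumerate(tokens):
        logits = next_token_logits(tokens[:t])             # shape: (vocab_size,)
        log_probs = logits - np.logaddexp.reduce(logits)   # log-softmax
        total += log_probs[tok]
    return total  # log p(x_1, ..., x_T) = sum_t log p(x_t | x_<t)
```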
(the exact observation is even stronger than what I wrote here; e.g., the low-rank structure "generalizes" across prompt-response pairs.)
but it turns out that if you arrange next-token logits from pairs of prompt × response sequences into a matrix (see pic for the exact object), you still get a *linear*, *low-rank* structure. neither this linearity nor the low-rankness follows by design; it somehow emerges from training.
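roughly the kind of check I have in mind, as a hedged sketch (this is not the paper's exact construction, and `log_prob(prompt, response)` is a hypothetical helper returning the model's total log-probability of a response given a prompt):

```python
import numpy as np

# Hedged sketch: build a matrix over prompt x response pairs and look at how
# quickly its singular values decay; a sharp decay means approximately low rank.
# NOT the paper's exact object; `log_prob` is a hypothetical helper.
def effective_rank(prompts, responses, log_prob, tol=1e-2):
    M = np.array([[log_prob(p, r) for r in responses] for p in prompts])
    s = np.linalg.svd(M, compute_uv=False)
    return int(np.sum(s / s[0] > tol))  # count of singular values above tol
```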
here's my understanding: the low-rank observation is a non-trivial extension of a more straightforward & well-known observation called the softmax bottleneck. If you stack a bunch of next-token logits from various prompts, you'll get a low-rank matrix. this is by *design* (the last-layer bottleneck).
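the by-design part is easy to see in a toy computation: every row of the stacked logit matrix is a hidden state times the same unembedding matrix, so the rank can't exceed the hidden width.

```python
import numpy as np

# Toy illustration of the softmax bottleneck (hypothetical sizes, random weights):
# stacked next-token logits are H @ W_unembed, so their rank is at most d.
d, V, N = 64, 5000, 500              # hidden width, vocab size, number of prompts
W_unembed = np.random.randn(d, V)    # shared last-layer (unembedding) matrix
H = np.random.randn(N, d)            # one final hidden state per prompt
logits = H @ W_unembed               # shape: (N, V)
print(np.linalg.matrix_rank(logits)) # at most d (= 64), far below min(N, V)
```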
If the low-rank logit structure really holds across settings, I expect it should have a lot of downstream corollaries & connections waiting to be discovered.
I also like the low-rank logits finding (arxiv.org/abs/2510.24966) because it provides a novel, simple and surprising abstraction to think about what function a trained LLM implements. It took me a *lot* of time to understand, appreciate and buy the exact result here...
Incredibly, you can select these datapoints through a straightforward method: see whether the given preference is aligned with a model prompted with the target behavior. (i'd have expected that you'd need an exponential search over all possible data subsets to accomplish this)
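as I understand it, the selection is roughly this (a hedged sketch with hypothetical helper names, not the paper's code):

```python
# Hedged sketch of the selection rule as I understand it. `log_prob` is a
# hypothetical helper giving the model's log-probability of a response given
# a prompt; `target_behavior_prompt` is a system-style prompt eliciting the
# target behavior. Keep only the preference pairs that this prompted model
# already agrees with, then run preference finetuning on that subset.
def select_aligned_pairs(pairs, log_prob, target_behavior_prompt):
    selected = []
    for prompt, chosen, rejected in pairs:
        prompted = target_behavior_prompt + "\n" + prompt
        if log_prob(prompted, chosen) > log_prob(prompted, rejected):
            selected.append((prompt, chosen, rejected))
    return selected
```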
This paper discovers another spooky generalization effect: to trigger any target behavior in an LLM, you can carefully subselect from a *completely unrelated* preference dataset such that preference finetuning on that subselected dataset produces that behavior.
Really liked this paper, which ties together two observations that are equally mind-boggling (low-rank logits & subliminal/weird generalization effects) and presents one other such observation.
arxiv.org/abs/2602.04863
The visual world is composed of objects, and those objects are composed of features. But do VLMs exploit this compositional structure when processing multi-object scenes? In our 🆒🆕 #ICLR2026 paper, we find they do – via emergent symbolic mechanisms for visual binding. 🧵👇
He also contrasts the personalities of Hardy and Einstein:
Currently reading "A Mathematician's Apology" by G.H. Hardy. This is an excerpt from the foreword by C.P. Snow describing Hardy's personality and his work:
in associative memory, the latent space doesn't really encode any interesting distance.
imagine you're trying to store which countries share borders. you could simply write down a list of adjacent countries OR you could visualize the world map in your head. this is "associative" vs "geometric".
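a toy version of the contrast (made-up data, just to illustrate): the associative store is a flat lookup table, while the geometric store places countries in a latent space where distance itself carries the adjacency information.

```python
import numpy as np

# Toy contrast, with made-up data. Associative: a flat set of adjacent pairs;
# the representation carries no notion of "closeness" beyond membership.
adjacent = {("France", "Spain"), ("France", "Germany"), ("Spain", "Portugal")}
def borders_assoc(a, b):
    return (a, b) in adjacent or (b, a) in adjacent

# Geometric: rough (longitude, latitude) coordinates; adjacency is read off
# distance, so the latent space itself encodes which countries are "near".
coords = {"France": np.array([2.0, 46.0]),   "Spain": np.array([-4.0, 40.0]),
          "Portugal": np.array([-8.0, 39.5]), "Germany": np.array([10.0, 51.0])}
def borders_geom(a, b, threshold=10.0):
    return float(np.linalg.norm(coords[a] - coords[b])) < threshold
```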
fascinating!
Would love pointers to related lit! Will DM you about the other question. Thank you for your kind words!
Rare to see such long-term efforts these days 🫡