
Posts by Moksh Jain

Check out our new work on action abstractions for amortized samplers, led by @boussifo.bsky.social! Simple tokenization schemes like BPE yield meaningful action abstractions, with several empirical benefits for amortized samplers (a toy sketch of the idea follows the preview below). Come chat with us @iclr-conf.bsky.social!

1 year ago
Preview — Amortizing intractable inference in large language models: Autoregressive large language models (LLMs) compress knowledge from their training data through next-token conditional distributions. This limits tractable querying of this knowledge to start-to-end a…
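Roughly what "BPE over actions" looks like in code — a minimal, self-contained sketch that repeatedly merges the most frequent adjacent pair of primitive actions into a macro-action, the same way BPE merges character pairs into tokens. The action alphabet, trajectories, and merge count below are toy placeholders, not anything from the paper:

```python
# BPE-style action abstraction: the most frequent adjacent pair of actions
# across trajectories becomes a single reusable macro-action.
from collections import Counter

def most_frequent_pair(trajectories):
    """Count adjacent action pairs across all trajectories."""
    pairs = Counter()
    for traj in trajectories:
        for a, b in zip(traj, traj[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0] if pairs else None

def merge_pair(trajectories, pair, new_symbol):
    """Replace every occurrence of `pair` with the merged macro-action."""
    merged = []
    for traj in trajectories:
        out, i = [], 0
        while i < len(traj):
            if i + 1 < len(traj) and (traj[i], traj[i + 1]) == pair:
                out.append(new_symbol)
                i += 2
            else:
                out.append(traj[i])
                i += 1
        merged.append(out)
    return merged

# Toy trajectories over primitive actions; "up right" co-occurs often,
# so the merge procedure discovers it as a macro-action.
trajectories = [
    ["up", "right", "up", "right", "stop"],
    ["up", "right", "down", "stop"],
    ["left", "up", "right", "stop"],
]
vocab = {"up", "right", "down", "left", "stop"}
for _ in range(2):  # number of merges is a hyperparameter
    top = most_frequent_pair(trajectories)
    if top is None:
        break
    (a, b), _count = top
    macro = f"{a}+{b}"
    trajectories = merge_pair(trajectories, (a, b), macro)
    vocab.add(macro)

print(vocab)         # now includes macro-actions like "up+right"
print(trajectories)  # trajectories rewritten over the extended action space
```

The appeal for an amortized sampler is that frequently co-occurring action pairs become single decisions, shortening trajectories and shrinking the effective credit-assignment horizon.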

It's nice to see the elicitation perspective getting discussed! RL on chain-of-thought (CoT) is really just a more reliable way of eliciting a model's latent capabilities than simple prompting. We took this perspective in our work (arxiv.org/abs/2310.04363), which was also one of the first to use RL on CoT.
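For readers unfamiliar with the setup, here is a deliberately toy sketch of "RL on CoT" as plain REINFORCE over a tabular policy: the policy samples a short chain of reasoning steps and is rewarded when it lands on the correct one. The step vocabulary, reward, and task are all hypothetical, and the linked paper itself uses GFlowNet fine-tuning (sampling CoTs in proportion to reward) rather than this vanilla objective:

```python
import torch

STEPS = ["add", "sub", "noop"]  # hypothetical reasoning-step vocabulary
COT_LEN = 3

# Tabular policy: independent logits for each position in the chain of thought.
logits = torch.zeros(COT_LEN, len(STEPS), requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)

def reward(cot):
    # Hypothetical task: the "correct" latent reasoning is add, add, noop.
    return 1.0 if cot == ["add", "add", "noop"] else 0.0

for _ in range(300):
    dist = torch.distributions.Categorical(logits=logits)
    batch = dist.sample((16,))                    # 16 sampled CoTs per update
    rewards = torch.tensor(
        [reward([STEPS[i] for i in row.tolist()]) for row in batch]
    )
    log_probs = dist.log_prob(batch).sum(dim=-1)  # log p(CoT) per sample
    loss = -(rewards * log_probs).mean()          # REINFORCE: maximize E[reward]
    opt.zero_grad()
    loss.backward()
    opt.step()

# The policy concentrates on the rewarded chain of thought.
print([STEPS[i] for i in logits.argmax(dim=-1).tolist()])
```

The point of the toy: the rewarded reasoning sequence was always in the policy's support; the RL objective just makes sampling it reliable, which is the elicitation view in miniature.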

1 year ago