Our new preprint:
Nature-specific threat coding in lateral septum guides defensive action.
We describe how the lateral septum (LS) guides defensive responses through critical computations carried out by functionally and molecularly distinct cell types and their afferent inputs.
www.researchsquare.com/article/rs-6...
Posts by Ching Fang
Oh cool, thanks for sharing! It does seem like we see very similar things. We should definitely chat!
In conclusion: Studying the cognitive computations behind rapid learning requires a broader hypothesis space of planning strategies than standard RL provides. In both tasks, the strategies rely on intermediate computations cached in memory tokens: episodic memory itself can be a computational workspace! (9/9)
In tree mazes, we find a strategy where in-context experience is stitched together to label a critical path from root to goal. If a query state is on this path, the model chooses an action that traverses deeper into the tree; if not, the optimal action is to move to the parent node. (8/9)
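As a concrete illustration, here is a minimal Python sketch of that strategy (a hypothetical rendering, not the paper's code), assuming the parent pointers and the goal have already been stitched together from in-context experience:

def critical_path(parent: dict, goal) -> list:
    """Follow parent pointers from the goal up to the root."""
    path = [goal]
    while path[-1] in parent:          # the root has no parent entry
        path.append(parent[path[-1]])
    return path[::-1]                  # root first, goal last

def choose_action(parent: dict, query, goal):
    path = critical_path(parent, goal)
    if query in path and query != goal:
        # On the critical path: traverse deeper, toward the next node on the path.
        return ("go_to", path[path.index(query) + 1])
    # Off the path: moving to the parent node is optimal.
    return ("go_to", parent[query])

# Toy binary tree: node 0 is the root; the goal is node 5 (path 0 -> 2 -> 5).
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
assert choose_action(parent, 0, 5) == ("go_to", 2)   # on the path: go deeper
assert choose_action(parent, 4, 5) == ("go_to", 1)   # off the path: go to parent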
Instead, our analysis of the model in gridworld suggests the following strategy: (1) use in-context experience to align representations to Euclidean space, (2) given a query state, compute the angle in Euclidean space to the goal, (3) use that angle to select an action. (7/9)
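Again as an illustration (not the paper's code), the three steps can be sketched in Python, assuming step (1) has already produced 2D coordinates aligned to Euclidean space:

import numpy as np

ACTIONS = {"east": 0.0, "north": np.pi / 2, "west": np.pi, "south": -np.pi / 2}

def choose_action(coords: dict, query, goal) -> str:
    dx, dy = np.asarray(coords[goal]) - np.asarray(coords[query])
    angle = np.arctan2(dy, dx)                    # step (2): angle to the goal
    # Step (3): pick the discrete action whose heading best matches that angle.
    dist = lambda a: abs(np.angle(np.exp(1j * (angle - a))))
    return min(ACTIONS, key=lambda name: dist(ACTIONS[name]))

coords = {"s0": (0, 0), "s1": (2, 3), "goal": (4, 4)}   # step (1): aligned coordinates
print(choose_action(coords, "s1", "goal"))   # -> "east" (goal is right and slightly up)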
Interestingly, when we examine the mechanisms used by the model for decision making, we do not see signatures expected from standard model-free and model-based learning: the model doesn't use value learning or path planning/state tracking at decision time. (6/9)
We find a few representation learning strategies: (1) in-context structure learning to form a map of the environment, and (2) alignment of representations across contexts that share the same structure. These connect to computations proposed for the hippocampal-entorhinal system. (5/9)
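One illustrative way to probe the second computation (not necessarily the paper's exact analysis) is an orthogonal Procrustes fit: if two contexts share a structure, their state representations should match up to a rotation. A sketch on synthetic data:

import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(3)
reps_A = rng.normal(size=(25, 8))                  # states x features, context A
Q = np.linalg.qr(rng.normal(size=(8, 8)))[0]       # a random rotation
reps_B = reps_A @ Q + 0.01 * rng.normal(size=(25, 8))   # same map, new axes + noise

R, _ = orthogonal_procrustes(reps_B, reps_A)       # best rotation mapping B onto A
err = np.linalg.norm(reps_B @ R - reps_A) / np.linalg.norm(reps_A)
print(err)   # small: the two contexts share one map up to rotation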
As expected, these meta-learned models learn more efficiently in new environments than standard RL agents, since they have useful priors over the task distribution. For instance, models can take shortcut paths in gridworld. So what RL strategies emerged to support this? (4/9)
We train transformers to perform in-context RL (via decision-pretraining from Lee et al. 2023) in planning tasks: gridworld and tree mazes (inspired by labyrinth mazes: elifesciences.org/articles/66175). Importantly, each new task has novel sensory observations. (3/9)
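For readers unfamiliar with decision-pretraining, here is a runnable toy of the objective, deliberately much simpler than this setup (a bandit instead of a planning task, and no query state or novel observations): a transformer is trained across sampled tasks to predict the optimal action from in-context experience.

import torch
import torch.nn as nn

n_arms, ctx_len, d = 5, 10, 32
layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
model = nn.TransformerEncoder(layer, 2)
embed = nn.Linear(n_arms + 1, d)    # token = one-hot arm + observed reward
head = nn.Linear(d, n_arms)
opt = torch.optim.Adam([*model.parameters(), *embed.parameters(), *head.parameters()], lr=1e-3)

for step in range(200):
    probs = torch.rand(n_arms)                        # sample a new task (a bandit)
    arms = torch.randint(n_arms, (ctx_len,))          # in-context experience...
    rewards = torch.bernoulli(probs[arms])            # ...collected in that task
    tokens = torch.cat([nn.functional.one_hot(arms, n_arms).float(),
                        rewards[:, None]], dim=-1)
    logits = head(model(embed(tokens)[None])[:, -1])  # read out from the last token
    loss = nn.functional.cross_entropy(logits, probs.argmax()[None])  # target: the optimal action
    opt.zero_grad(); loss.backward(); opt.step()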
Transformers are a useful setting for studying these questions because they can learn rapidly in-context. But also, key-value architectures have been connected to episodic memory systems in the brain! E.g., see our previous work, among many others (2/9): elifesciences.org/reviewed-pre...
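The key-value reading of attention is easy to see in code. In this illustrative sketch, stored keys act as retrieval cues and a noisy query recalls the matching value, much like cued recall from episodic memory:

import numpy as np

rng = np.random.default_rng(0)
n_mem, d = 8, 64
K = rng.normal(size=(n_mem, d))        # keys: one retrieval cue per stored memory
V = rng.normal(size=(n_mem, d))        # values: the stored memory contents
q = K[3] + 0.1 * rng.normal(size=d)    # a noisy cue for memory 3

w = np.exp(K @ q / np.sqrt(d))         # softmax attention over stored memories
w /= w.sum()
readout = w @ V                        # weighted recall, dominated by V[3]
print(w.argmax(), np.corrcoef(readout, V[3])[0, 1])   # -> 3, with correlation near 1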
Humans and animals can rapidly learn in new environments. What computations support this? We study the mechanisms of in-context reinforcement learning in transformers, and propose how episodic memory can support rapid learning. Work w/ @kanakarajanphd.bsky.social: arxiv.org/abs/2506.19686
More exciting news! Our paper "From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks" has been accepted at ICLR 2025!
arxiv.org/abs/2409.14623
A thread on how relative weight initialization shapes learning dynamics in deep networks. 🧵 (1/9)
Already feeling #cosyne2025 withdrawal? Apply to the Flatiron Institute Junior Theoretical Neuroscience Workshop! Applications are due April 14th.
jtnworkshop2025.flatironinstitute.org
[Photo: the BU CDS building, which looks like a Jenga tower]
Life update: I'm starting as faculty at Boston University
@bucds.bsky.social in 2026! BU has SCHEMES for LM interpretability & analysis, I couldn't be more pumped to join a burgeoning supergroup w/ @najoung.bsky.social @amuuueller.bsky.social. Looking for my first students, so apply and reach out!
What are your plans for April 5th? Decide now which event you'll attend and who you'll bring along. See you in the streets!
handsoff2025.com/about
I'll be presenting this at #cosyne2025 (poster 3-50)!
I'll also be giving a talk at the "Collectively Emerged Timescales" workshop on this work, plus other projects on emergent dynamics in neural circuits.
Looking forward to seeing everyone in 🇨🇦!
Our paper, "A Theory of Initialization's Impact on Specialization," has been accepted to ICLR 2025!
openreview.net/forum?id=RQz...
We show how neural networks can build specialized and shared representations depending on initialization; this has consequences for continual learning.
(1/8)
We'll have a poster on this at #Cosyne2025 in the third poster session (3-055). Come say hi if you're curious!
In particular, barcodes are a plausible neural correlate for the precise slot-retrieval mechanism in key-value memory systems (see arxiv.org/abs/2501.02950)! Barcodes provide a content-independent scaffold that binds to memory content, and they prevent memories with overlapping content from blurring together.
Why is this useful? We show that place fields + barcodes are complementary: barcodes enable precise recall of cache locations, while place fields enable flexible search for nearby caches. Both are necessary. We also show how barcode memory combines with predictive maps; check out the paper for more!
A memory of a cache is formed by binding place + seed content to the resulting RNN barcode via Hebbian learning. An animal can recall this memory from place inputs (together with high recurrent strength in the RNN). These barcodes capture the spatial correlation profile seen in the data.
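A self-contained cartoon of that binding-and-recall step (illustrative, not the paper's model): sparse random vectors stand in for RNN barcodes, and a Hebbian outer product binds each barcode to its content.

import numpy as np

rng = np.random.default_rng(2)
N = 500
b1 = (rng.random(N) < 0.05).astype(float)         # sparse barcode for cache 1
b2 = (rng.random(N) < 0.05).astype(float)         # sparse barcode for cache 2
c1, c2 = rng.normal(size=N), rng.normal(size=N)   # bound place + seed content

W = np.outer(c1, b1) + np.outer(c2, b2)           # Hebbian binding (outer products)

# Recall: reinstating barcode 1 (as if regenerated from place inputs at high
# recurrent strength) retrieves its content with little cross-talk, because
# sparse random barcodes barely overlap.
recalled = W @ b1 / (b1 @ b1)
print(np.corrcoef(recalled, c1)[0, 1])            # near 1: content recovered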
We suggest an RNN model of barcode memory. The RNN is initialized with random weights and receives place inputs. When recurrent gain is low, RNN units encode place. With high recurrent strength, the random weights produce sparse, uncorrelated barcodes via chaotic dynamics.
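A cartoon of that gain effect (illustrative, not the paper's model, and ignoring details such as sparseness): the same random network is place-like at low recurrent gain and barcode-like at high gain.

import numpy as np

rng = np.random.default_rng(1)
N = 500
J = rng.normal(scale=1.0 / np.sqrt(N), size=(N, N))   # random recurrent weights

def run(inp, g, steps=200):
    x = np.zeros(N)
    for _ in range(steps):
        x = np.tanh(g * J @ x + inp)   # g scales the recurrent strength
    return x

place_a = rng.normal(size=N)                    # input for one location
place_b = place_a + 0.05 * rng.normal(size=N)   # input for a nearby location

corr = lambda u, v: np.corrcoef(u, v)[0, 1]
print(corr(run(place_a, g=0.1), run(place_b, g=0.1)))  # near 1: low gain keeps nearby places similar
print(corr(run(place_a, g=4.0), run(place_b, g=4.0)))  # much lower: chaotic dynamics pull them apart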
We were inspired by @selmaan.bsky.social and Emily Mackevicius' data on neural activity in the hippocampus of food-caching birds during a memory task. Cache events are encoded by barcode activity: sparse, uncorrelated patterns. Barcode and place activity coexist in the same population!
How does barcode activity in the hippocampus enable precise and flexible memory? How does this relate to key-value memory systems? Our work (w/ Jack Lindsey, Larry Abbott, Dmitriy Aronov, @selmaan.bsky.social ) is now in eLife as a reviewed preprint: elifesciences.org/reviewed-pre...
We're organizing a #CoSyNe2025 workshop on what agent models can teach us about neuroscience! See Sat's thread for more info.
If you're interested in foundation model approaches for combining EEG/EMG/ECG/EOG data, check out our work at the NeurIPS AIM-FM workshop tomorrow! This was work done with Apple's Body-Sensing Intelligence Group 🧠🤖 arxiv.org/abs/2410.16424
Thanks for putting this together! Would also love to be added :)
Thanks for making this Amy! Would also like to be added if possible :)