Cosyne invited me to give a long tutorial (4 hours!) on methods for quantifying differences between high-d neural recordings across animals, brain regions, deep neural nets, etc.
The recording is up on youtube. I hope it inspires more research on this fundamental topic!
www.youtube.com/watch?v=n44x...
Posts by A Erdem Sagtekin
I am totally pumped about this new work. "Task-trained RNNs" are a powerful and influential framework in neuroscience, but they have lacked a firm theoretical footing. This work provides one, and makes direct contact with the classical theory of random RNNs:
www.biorxiv.org/content/10.6...
7/7 Overall, EF is a powerful temporal credit assignment mechanism and a promising candidate model for learning in biological systems. It was an incredible experience working on this with @colin-bredenberg.bsky.social and Cristina, and I’m looking forward to feedback and discussing error forcing!
Multi-panel figure comparing Error Forcing, Teacher Forcing, and backpropagation through time on three RNN tasks: delayed XOR, sine wave generation, and evidence integration. Across increasing delay and task settings, Error Forcing consistently yields a higher fraction of networks that learn and lower test error than Teacher Forcing or backpropagation alone.
6/7 We tried three tasks of varying difficulty and showed that using EF together with BPTT improves learning relative to using TF with BPTT or using BPTT alone. For biological plausibility, we also tried error forcing with RFLO, and found that it can improve RFLO as well.
5/7 We then provided a probabilistic perspective, showing that EF is an approximation to the expectation-maximization algorithm with a reparameterization trick, in which neural activities are adjusted in the E step and synaptic weights in the M step (similar to predictive coding).
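For readers who want the generic template being referenced: in hard (MAP) EM for a latent-variable model with activities r and weights θ, the two steps alternate roughly as below. This is the standard textbook form with illustrative notation, not the paper's exact objective.

```latex
% Generic hard-EM (MAP-EM) alternation; notation is illustrative, not the paper's.
\begin{align*}
\text{E step:}\quad & \hat{\mathbf{r}}_{1:T} \;\leftarrow\; \arg\max_{\mathbf{r}_{1:T}} \; \log p\!\left(\mathbf{y}_{1:T}, \mathbf{r}_{1:T} \mid \theta\right)
  && \text{(adjust neural activities)} \\
\text{M step:}\quad & \theta \;\leftarrow\; \arg\max_{\theta} \; \log p\!\left(\mathbf{y}_{1:T}, \hat{\mathbf{r}}_{1:T} \mid \theta\right)
  && \text{(adjust synaptic weights)}
\end{align*}
```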
Three-panel schematic comparing BPTT, Teacher Forcing, and Error Forcing in a two-dimensional neuron-activity space. A diagonal “zero-error manifold” is shown; BPTT follows a free-running state trajectory, Teacher Forcing projects to a fixed target state on the manifold, and Error Forcing produces corrected states that are more flexible.
4/7 We first showed this by taking a geometric perspective. When the dimensionality of the network is higher than that of the output, TF overconstrains the network dynamics during learning, which limits its benefits. In contrast, EF intervenes only minimally in the network dynamics.
3/7 Building on their work, as well as Kenji Doya’s and many others’, we realized that a slight modification to teacher forcing, in which we feed back the error in a specific way (hence the name Error Forcing) rather than feeding the teacher (target) activity into the RNN states, leads to better optimization.
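To make the contrast concrete, here is a minimal NumPy sketch of the two kinds of state intervention. The minimum-norm correction via pinv(W_out) is my own illustration of "minimally intervening"; it is not taken from the paper, which may define the forced state differently.

```python
# Sketch: full teacher forcing vs. a minimal-norm, error-driven state correction.
# Illustration of the geometric idea only; NOT the paper's exact update rule.
import numpy as np

rng = np.random.default_rng(0)
N, M = 50, 2                        # network dimension > output dimension
W_out = rng.standard_normal((M, N)) / np.sqrt(N)   # linear readout (assumed)

h = rng.standard_normal(N)          # free-running hidden state at one time step
y_target = rng.standard_normal(M)   # desired output at that time step
err = y_target - W_out @ h          # output error

# Teacher forcing: replace the state with a fixed target state on the
# zero-error manifold (here: the minimum-norm state producing y_target).
h_tf = np.linalg.pinv(W_out) @ y_target

# Error-forcing-style correction: keep the free-running state and add the
# smallest perturbation that cancels the output error.
h_ef = h + np.linalg.pinv(W_out) @ err

print("output error after TF:", np.linalg.norm(y_target - W_out @ h_tf))  # ~0
print("output error after EF:", np.linalg.norm(y_target - W_out @ h_ef))  # ~0
print("state change under TF:", np.linalg.norm(h_tf - h))                 # larger
print("state change under EF:", np.linalg.norm(h_ef - h))                 # smaller
```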
2/7 In our NeurIPS 2025 spotlight paper, we introduced Error Forcing (EF). Our method builds on Teacher Forcing (TF), a beautiful method proposed more than 35 years ago to improve RNN learning. Recently, the Durstewitz Lab elegantly showed the benefits of Generalized Teacher Forcing.
Diagram of a recurrent neural network: input goes into the network, output is compared to a target to produce an error, and dotted feedback arrows show updates to neural activity and to synaptic weights.
1/7 How should feedback signals influence a network during learning? Should they first adjust synaptic weights, which then indirectly change neural activity (as in backprop)? Or should they first adjust neural activity to guide synaptic updates (as in, e.g., target prop)? openreview.net/forum?id=xVI...
1/X Excited to present this preprint on multi-tasking, with @david-g-clark.bsky.social and Ashok Litwin-Kumar! Timely too, as “low-D manifold” has been trending again. (If you read through to the end, we escape Flatland and return to the glorious high-D world we deserve.) www.biorxiv.org/content/10.6...
1/6 Why does the brain maintain such precise excitatory-inhibitory balance?
Our new preprint explores a provocative idea: small, targeted deviations from this balance may serve a purpose, encoding local error signals for learning.
www.biorxiv.org/content/10.1...
led by @jrbch.bsky.social
How to find all fixed points in piecewise-linear recurrent neural networks (RNNs)?
A short thread 🧵
In RNNs with N units and ReLU(x - b) activations, the phase space is partitioned into 2^N regions by the hyperplanes at x = b. 1/7
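A brute-force sketch of the region-by-region idea, feasible for small N. I assume dynamics of the form dx/dt = -x + W·ReLU(x - b); the thread's exact parameterization may differ.

```python
# Enumerate candidate fixed points of a piecewise-linear RNN by solving a linear
# system in each of the 2^N activation regions and keeping only self-consistent
# solutions. Assumes dx/dt = -x + W @ relu(x - b), so x* = W @ relu(x* - b).
import itertools
import numpy as np

def fixed_points(W, b, tol=1e-9):
    N = len(b)
    fps = []
    for pattern in itertools.product([0, 1], repeat=N):   # 2^N on/off patterns
        D = np.diag(pattern).astype(float)                 # relu(x - b) = D (x - b) here
        A = np.eye(N) - W @ D                              # (I - W D) x* = -W D b
        try:
            x_star = np.linalg.solve(A, -W @ D @ b)
        except np.linalg.LinAlgError:
            continue                                       # singular region, skip
        # consistency check: assumed-active units must actually be above threshold
        active = (x_star - b) > tol
        if np.array_equal(active.astype(int), np.array(pattern)):
            fps.append(x_star)
    return fps

rng = np.random.default_rng(1)
N = 6
W = rng.standard_normal((N, N)) / np.sqrt(N)
b = 0.1 * rng.standard_normal(N)
print(f"found {len(fixed_points(W, b))} fixed points")
```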
(1/5) Fun fact: Several classic results in the stat. mech. of learning can be derived in a couple lines of simple algebra!
In this paper with Haim Sompolinsky, we simplify and unify derivations for high-dimensional convex learning problems using a bipartite cavity method.
arxiv.org/abs/2412.01110
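As one example of the kind of classic result meant here (stated from memory as a standard fact, not quoted from the paper): Gardner's storage capacity of the spherical perceptron at margin κ,

```latex
% Gardner capacity of the spherical perceptron (classic result, for illustration).
\alpha_c(\kappa) \;=\; \left[\,\int_{-\kappa}^{\infty} \frac{dt}{\sqrt{2\pi}}\, e^{-t^2/2}\,(t+\kappa)^2 \right]^{-1},
\qquad \alpha_c(0) = 2 .
```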
This list likely reflects mainly my interests and circle, and I’m sure I’ve missed many people, but I gave it a try: (I’ll be slowly editing it until it reaches 150/150)
go.bsky.app/7VFUkdn
(also, I tried but couldn't remove my profile...)
I enjoyed reading the geometry of plasticity paper and felt that something important was coming; this is it: