
Posts by A Erdem Sagtekin

Cosyne 2026 - Cosyne Tutorial: Comparative Analysis of Neural Population Codes YouTube video by Cosyne Talks

Cosyne invited me to give a long tutorial (4 hours!) on methods to quantify differences in high-d neural recordings across animals, brain regions, deep neural nets, etc.

The recording is up on youtube. I hope it inspires more research on this fundamental topic!

www.youtube.com/watch?v=n44x...

1 month ago 160 56 3 1

I am totally pumped about this new work. "Task-trained RNNs" are a powerful and influential framework in neuroscience, but have lacked a firm theoretical footing. This work provides one, and makes direct contact with the classical theory of random RNNs:
www.biorxiv.org/content/10.6...

1 month ago 87 33 2 3

7/7 Overall, EF is a powerful temporal credit assignment mechanism and a promising candidate model for learning in biological systems. It was an incredible experience working on this with @colin-bredenberg.bsky.social and Cristina, and I’m looking forward to feedback and discussing error forcing!

3 months ago 4 0 0 0
Multi-panel figure comparing Error Forcing, Teacher Forcing, and backpropagation through time on three RNN tasks: delayed XOR, sine wave generation, and evidence integration. Across increasing delay and task settings, Error Forcing consistently yields a higher fraction of networks that learn and lower test error than Teacher Forcing or backpropagation alone.


6/7 We tried three different tasks with varying difficulties and showed that using EF together with BPTT improves learning, relative to using TF with BPTT or using BPTT alone. For biological plausibility, we also tried error forcing with RFLO, and found that it can improve RFLO as well.

3 months ago 1 0 1 0

5/7 We then provided a probabilistic perspective, showing that EF is an approximation to the expectation-maximization algorithm with a reparameterization trick, where, in the E step, neural activities are adjusted, and, in the M step, synaptic weights are adjusted (similar to predictive coding).

3 months ago 1 1 1 0
Three-panel schematic comparing BPTT, Teacher Forcing, and Error Forcing in a two-dimensional neuron-activity space. A diagonal “zero-error manifold” is shown; BPTT follows a free-running state trajectory, Teacher Forcing projects to a fixed target state on the manifold, and Error Forcing produces corrected states that are more flexible.


4/7 We first showed this by providing a geometric perspective. In scenarios where the dimensionality of the network is higher than that of the output, TF overconstrains the network dynamics during learning, which degrades its benefits. In contrast, EF minimally intervenes in the network dynamics.

3 months ago 2 0 1 0
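
The geometric picture in 4/7 can be illustrated with a toy calculation. The sketch below is an assumption-laden illustration, not the update rule from the paper: it assumes a hypothetical linear readout `y = C @ h`, models the teacher-forced state as one fixed point on the zero-error manifold, and models the "minimal intervention" as the minimum-norm state correction computed from the output error with a pseudoinverse. All names (`C`, `h`, `y_target`, `h_fixed_target`, `h_error_corrected`) are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_out = 5, 1                    # more units than outputs

C = rng.normal(size=(n_out, n_units))    # hypothetical linear readout: y = C @ h
h = rng.normal(size=n_units)             # free-running network state
y_target = np.array([0.5])               # desired output at this time step

C_pinv = np.linalg.pinv(C)               # shape (n_units, n_out)

# Assumed stand-in for teacher forcing: jump to one fixed state on the
# zero-error manifold {h : C @ h = y_target}, regardless of where h is.
h_fixed_target = C_pinv @ y_target

# Assumed stand-in for an error-driven correction: move h by the
# smallest-norm step that cancels the output error.
error = y_target - C @ h
h_error_corrected = h + C_pinv @ error

# Both corrected states produce the target output...
print(C @ h_fixed_target, C @ h_error_corrected)

# ...but the error-driven step displaces the state no more than the jump
# to the fixed target, because it is the orthogonal projection of h onto
# the manifold.
step_fixed = np.linalg.norm(h_fixed_target - h)
step_error = np.linalg.norm(h_error_corrected - h)
print(step_fixed, step_error)
```

In this toy setting the components of `h` that the readout cannot see are left untouched by the error-driven step, whereas the jump to a fixed target overwrites them.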

3/7 Building on their work, as well as Kenji Doya’s and many others’, we realized that a slight modification to teacher forcing, in which we feed the error in a specific way (hence the name Error Forcing) rather than feeding the teacher (target) activity to RNN states, leads to better optimization.

3 months ago 2 0 1 0

2/7 In our NeurIPS 2025 spotlight paper, we introduced Error Forcing (EF). Our method builds on Teacher Forcing (TF), a beautiful method proposed more than 35 years ago to improve RNN learning. Recently, the Durstewitz Lab elegantly showed the benefits of Generalized Teacher Forcing.

3 months ago 1 0 1 0
Diagram of a recurrent neural network: input goes into the network, output is compared to a target to produce an error, and dotted feedback arrows show updates to neural activity and to synaptic weights.


1/7 How should feedback signals influence a network during learning? Should they first adjust synaptic weights, which then indirectly change neural activity (as in backprop.)? Or should they first adjust neural activity to guide synaptic updates (e.g., target prop.)? openreview.net/forum?id=xVI...

3 months ago 40 5 1 0
A theory of multi-task computation and task selection
Neural activity during the performance of a stereotyped behavioral task is often described as low-dimensional, occupying only a limited region in the space of all firing-rate patterns. This region has...

1/X Excited to present this preprint on multi-tasking, with @david-g-clark.bsky.social and Ashok Litwin-Kumar! Timely too, as “low-D manifold” has been trending again. (If you read thru the end, we escape Flatland and return to the glorious high-D world we deserve.) www.biorxiv.org/content/10.6...

4 months ago 85 20 1 2

1/6 Why does the brain maintain such precise excitatory-inhibitory balance?
Our new preprint explores a provocative idea: Small, targeted deviations from this balance may serve a purpose: to encode local error signals for learning.
www.biorxiv.org/content/10.1...
led by @jrbch.bsky.social

10 months ago 181 57 5 3

How to find all fixed points in piece-wise linear recurrent neural networks (RNNs)?
A short thread 🧵

In RNNs with N units and ReLU(x-b) activations, the phase space is partitioned into 2^N regions by hyperplanes at x=b 1/7

1 year ago 63 12 1 0
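
The partition into 2^N regions suggests a brute-force approach for small N: inside each region the network is linear, so one can solve a linear system per region and keep the solutions that actually lie in that region. The sketch below is not the method from the thread (only its first post appears here); it assumes discrete-time dynamics `x = W @ relu(x - b) + h`, and the names `find_fixed_points`, `W`, `b`, `h` are hypothetical. The cost grows as 2^N, so it is only practical for a handful of units.

```python
import itertools
import numpy as np

def find_fixed_points(W, b, h, tol=1e-9):
    """Enumerate fixed points of x = W @ relu(x - b) + h (assumed dynamics)."""
    n = len(b)
    identity = np.eye(n)
    fixed_points = []
    # Each pattern says which units are above threshold (1) or below (0).
    for pattern in itertools.product([0, 1], repeat=n):
        d = np.array(pattern, dtype=float)
        WD = W * d                      # same as W @ diag(d): zero inactive columns
        A = identity - WD               # in this region: (I - W D) x = h - W D b
        if abs(np.linalg.det(A)) < tol:
            continue                    # degenerate region, skipped in this sketch
        x = np.linalg.solve(A, h - WD @ b)
        # Keep x only if it lies in the region the pattern describes.
        if np.array_equal(x - b > 0, d.astype(bool)):
            fixed_points.append(x)
    return fixed_points

# Two mutually inhibiting units with thresholds at zero.
W = np.array([[0.0, -2.0], [-2.0, 0.0]])
b = np.zeros(2)
h = np.ones(2)

fps = find_fixed_points(W, b, h)
for x in fps:
    print(x)
```

For this two-unit example the search checks 4 regions and keeps the candidates that are self-consistent; the check `x - b > 0` is what rules out solutions of a region's linear system that fall outside that region.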
Simplified derivations for high-dimensional convex learning problems
Statistical physics provides tools for analyzing high-dimensional problems in machine learning and theoretical neuroscience. These calculations, particularly those using the replica method, often invo...

(1/5) Fun fact: Several classic results in the stat. mech. of learning can be derived in a couple lines of simple algebra!

In this paper with Haim Sompolinsky, we simplify and unify derivations for high-dimensional convex learning problems using a bipartite cavity method.
arxiv.org/abs/2412.01110

1 year ago 57 16 2 1

This list likely reflects mainly my interests and circle, and I’m sure I’ve missed many people, but I gave it a try: (I’ll be slowly editing it until it reaches 150/150)

go.bsky.app/7VFUkdn

(also, I tried but couldn't remove my profile...)

1 year ago 81 51 48 8

i enjoyed reading the geometry of plasticity paper and felt that something important was coming, this is it:

1 year ago 4 1 1 0