
Posts by Jingfeng Wu


Don't miss the next Statistics and DSI Joint Colloquium!

@uuujf.bsky.social, a postdoctoral fellow at the Simons Institute at @ucberkeleyofficial.bsky.social, presents 'Towards a Less Conservative Theory of Machine Learning: Unstable Optimization and Implicit Regularization' on Thursday, February 5th

2 months ago

slides: uuujf.github.io/postdoc/wu20...

6 months ago
GD dominates ridge

Sharing a new paper w/ Peter Bartlett, @jasondeanlee.bsky.social, @shamkakade.bsky.social, and Bin Yu

People talk about implicit regularization, but how good is it? We show it's surprisingly effective: GD dominates ridge for linear regression, with more cool results on GD vs. SGD

arxiv.org/abs/2509.17251
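The paper's comparison is theoretical, but the setup is easy to simulate. Here is a minimal sketch (my own toy construction, not the paper's experiment): run GD on the unregularized least-squares loss and treat each iterate as an early-stopped model, then compare the best excess risk along the GD path with the best ridge risk over a grid of regularization strengths.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 20
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d) / np.sqrt(d)      # ground-truth weights
y = X @ w_star + 0.1 * rng.normal(size=n)     # noisy labels

X_test = rng.normal(size=(2000, d))           # fresh data for excess risk

def excess_risk(w):
    return np.mean((X_test @ (w - w_star)) ** 2)

# GD on the *unregularized* least-squares loss, started from zero;
# every iterate along the path is an implicitly regularized candidate.
smooth = np.linalg.norm(X, 2) ** 2 / n        # smoothness constant of the loss
eta = 0.5 / smooth
w = np.zeros(d)
gd_risks = []
for _ in range(500):
    w -= eta * X.T @ (X @ w - y) / n
    gd_risks.append(excess_risk(w))

# Explicit ridge regularization over a grid of strengths.
ridge_risks = [
    excess_risk(np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n))
    for lam in np.logspace(-6, 2, 60)
]

print(f"best early-stopped GD: {min(gd_risks):.4f}")
print(f"best ridge:            {min(ridge_risks):.4f}")
```

On this well-conditioned toy problem the two are close; the paper's claim is the stronger instance-wise statement that early-stopped GD is never much worse than the best ridge solution, while the converse can fail.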

6 months ago

Lassen in August

7 months ago
I’m an award-winning mathematician. Trump just cut my funding. The “Mozart of Math” tried to stay out of politics. Then it came for his research.

I wrote an op-ed on the world-class STEM research ecosystem in the United States, and how this ecosystem is now under attack on multiple fronts by the current administration: newsletter.ofthebrave.org/p/im-an-awar...

8 months ago
More Than 50 Simons Foundation Grantees to Speak at 2026 International Congress of Mathematicians (Simons Foundation)

Congratulations to our colleague and friend, former Simons Institute Associate Director Peter Bartlett, who will be delivering one of the plenary lectures for the 2026 International Congress of Mathematicians.

www.simonsfoundation.org/2025/07/11/m...

8 months ago
Terence Tao (@tao@mathstodon.xyz)
It is tempting to view the capability of current AI technology as a singular quantity: either a given task X is within the ability of current tools, or it is not. However, there is in fact a very wid...

My thoughts on the crucial importance of methodology on self-reported AI performance on mathematics competitions, and my policy on commenting on such reports going forward: mathstodon.xyz/@tao/1148814...

9 months ago

📣Join us at COLT 2025 in Lyon for a community event!
📅When: Mon, June 30 | 16:00 CET
What: Fireside chat w/ Peter Bartlett & Vitaly Feldman on communicating a research agenda, followed by mentorship roundtable to practice elevator pitches & mingle w/ COLT community!
let-all.com/colt25.html

9 months ago
Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression
We study gradient descent (GD) with a constant stepsize for $\ell_2$-regularized logistic regression with linearly separable data. Classical theory suggests small stepsizes to ensure monotonic reducti...

2/2 For regularized logistic regression (strongly convex and smooth) with separable data, we show that GD with simply a large stepsize can match Nesterov's acceleration, among other cool results.

arxiv.org/abs/2506.02336

10 months ago
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
We study $\textit{gradient descent}$ (GD) for logistic regression on linearly separable data with stepsizes that adapt to the current risk, scaled by a constant hyperparameter $η$. We show that after ...

1/2 For the task of finding a linear separator of a separable dataset with margin gamma, 1/gamma^2 steps suffice for adaptive GD with large stepsizes (applied to the logistic loss). This is minimax optimal for first-order methods, and impossible for GD with small stepsizes.

arxiv.org/abs/2504.04105
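A rough sketch of the flavor of this result (my own toy; the paper's precise stepsize schedule may differ, so treat the "adapt to the current risk" rule below as one illustrative reading): GD on the logistic loss, with stepsize inversely proportional to the current risk, finds a linear separator of a margin-bounded dataset in a small number of steps.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
u = np.ones(d) / np.sqrt(d)              # ground-truth separating direction
X = rng.normal(size=(n, d))
keep = np.abs(X @ u) > 0.5               # enforce a margin along u
X = X[keep]
y = np.sign(X @ u)

def risk(w):
    return np.mean(np.log1p(np.exp(-y * (X @ w))))   # logistic loss

def grad(w):
    s = 1.0 / (1.0 + np.exp(y * (X @ w)))            # sigmoid(-y <x, w>)
    return -(X.T @ (s * y)) / len(y)

# Adaptive stepsize: eta / current risk (an illustrative reading of
# "stepsizes that adapt to the current risk"; eta is the hyperparameter).
eta = 1.0
w = np.zeros(d)
steps = 0
for _ in range(500):
    if np.all(np.sign(X @ w) == y):      # stop once the data is separated
        break
    w -= (eta / risk(w)) * grad(w)
    steps += 1

print(f"separated after {steps} steps")
```

With this margin, separation should occur well within the iteration budget, consistent with the 1/gamma^2 scaling; small-stepsize GD needs far more steps on the same data.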

10 months ago
effects of stepsize for GD

Sharing two new papers on accelerating GD via large stepsizes!

Classical GD analysis assumes small stepsizes for stability. However, in practice, GD is often used with large stepsizes, which lead to instability.
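For intuition on the stability threshold (a standard textbook fact, not taken from these papers): on a quadratic with curvature L, the GD update contracts exactly when the stepsize is below 2/L, and diverges beyond it. Part of what makes the large-stepsize results surprising is that for logistic-type losses the instability is transient rather than fatal.

```python
# Toy demo: GD on f(w) = (L/2) * w^2, whose gradient is L * w.
# The update w <- (1 - eta * L) * w contracts iff |1 - eta * L| < 1,
# i.e. iff eta < 2 / L.
L = 1.0

def gd_final(eta, w0=1.0, steps=100):
    w = w0
    for _ in range(steps):
        w -= eta * L * w
    return abs(w)

print(gd_final(0.5 / L))   # stable: shrinks geometrically toward 0
print(gd_final(2.5 / L))   # unstable: |1 - eta*L| = 1.5, blows up
```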

See my slides for more details on this topic: uuujf.github.io/postdoc/wu20...

10 months ago

Jingfeng Wu, Pierre Marion, Peter Bartlett
Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression
https://arxiv.org/abs/2506.02336

10 months ago

Rocky Mountain in May

11 months ago

Announcing the first workshop on Foundations of Post-Training (FoPT) at COLT 2025!

📝 Soliciting abstracts/posters exploring theoretical & practical aspects of post-training and RL with language models!

🗓️ Deadline: May 19, 2025

11 months ago

We were very lucky to have Peter Bartlett visit @uwcheritoncs.bsky.social and give a Distinguished Lecture on "Gradient Optimization Methods: The Benefits of a Large Step-size." Very interesting and surprising results.

(Recording will be available eventually)

11 months ago
Tips on How to Connect at Academic Conferences
I was a kinda awkward teenager. If you are a CS researcher reading this post, then chances are, you were too. How to navigate social situations and make friends is not always intuitive, and has to …

I wrote a post on how to connect with people (i.e., make friends) at CS conferences. These events can be intimidating, so here are some suggestions on how to navigate them

I'm late for #ICLR2025 #NAACL2025, but in time for #AISTATS2025 #ICML2025! 1/3
kamathematics.wordpress.com/2025/05/01/t...

11 months ago

Yosemite in April

11 months ago

Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
https://arxiv.org/abs/2504.04105

1 year ago
The Future of Language Models and Transformers
Transformers have now been scaled to vast amounts of static data. This approach has been so successful it has forced the research community to ask, "What's next?". This workshop will bring together re...

Join us for a week of talks on The Future of Language Models and Transformers at the Simons Institute. Talks by @profsanjeevarora.bsky.social, Azalia Mirhoseini, Kilian Weinberger and others. Mon, March 31 - Fri, April 4.
simons.berkeley.edu/workshops/future-language-models-transformers

1 year ago