New Study Shows AdamW Converges at O(√d / K¹⁄⁴) in L1 Norm
A new study proves that the AdamW optimizer converges at O(sqrt(d)/K^{1/4}) in the L1 norm: the average L1 norm of the gradient over K iterations is bounded by sqrt(d)·C/K^{1/4}, where d is the parameter dimension and C is a constant. The same rate holds for NAdamW. Read more: getnews.me/new-study-shows-adamw-co... #adamw #nadamw #optimization
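For readers who want the bound written out, here is a minimal sketch of the stated result in the usual averaged-iterate form; the notation x_k for the iterates, f for the objective, and the expectation over stochastic gradients are assumptions filled in from the standard non-convex setting, not spelled out in the post:

```latex
% Stated rate, written as the usual averaged-iterate bound (assumed form):
% d = parameter dimension, K = number of iterations, C = a constant.
\[
  \frac{1}{K} \sum_{k=1}^{K} \mathbb{E}\,\bigl\| \nabla f(x_k) \bigr\|_1
  \;\le\; \frac{\sqrt{d}\, C}{K^{1/4}}
  \;=\; O\!\left(\frac{\sqrt{d}}{K^{1/4}}\right)
\]
```

In words, the time-averaged L1 gradient norm shrinks as K grows, at a rate that scales with sqrt(d) in the problem dimension.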