Are transformers really all we need? I doubt it. We tested alternative backbones for language models in low-resource scenarios (#Mamba, #xLSTM, and #HGRN2), and they work surprisingly well!
Paper: aclanthology.org/2024.conll-b...
Thanks for being part of the #BabyLM Challenge!