Are transformers really all we need? I doubt it. We tested alternative backbones for language models in low-resource scenarios (#Mamba, #xLSTM, and #HGRN2), and they work surprisingly well!
Paper: aclanthology.org/2024.conll-b...
Thanks for being part of the #BabyLM Challenge!