Thrilled to release Gaperon, an open LLM suite for French, English and Coding
We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data
(TLDR: we cheat and get good scores)
@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social
Posts by Delip Rao
Yeah, posting something that big 2 minutes before the weekend in the US and late in the evening in France, right before a 4-day weekend here, is so not ideal, lol. So we'll redo it and tell you guys much more.. #TrainingTragedy
Tbh the only visual allegory possible is this...
Thank you for your interest in our work. Looking forward to any feedback.
WithdrarXiv
- Dataset of 14K+ withdrawn arXiv papers
- associated retraction comments
- entire history through 09/24
- taxonomy of retraction reasons, from critical errors to policy violations
- WithdrarXiv-SciFy, enriched version w/ scripts for parsed full-text PDFs
arxiv.org/abs/2412.03775
Stumbled across this post on Substack by @deliprao.bsky.social today that I really appreciated as someone trying to break into the field. Simple categorizations can seem trite at times, but they can be deceptively profound in breaking down complex problems.
substack.com/home/post/p-...
Can anyone on my TL endorse me for cs.DL (digital libraries) on arXiv?
Releasing: a dataset of two million Bluesky posts.
This dataset has been collected using Bluesky's API, and I hope it will be useful for all the researchers out there!
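The post above mentions collecting posts via Bluesky's API. A minimal sketch of how such collection can work, using the public AppView's `app.bsky.feed.searchPosts` XRPC endpoint and cursor-based pagination; the helper names (`fetch_page`, `collect_posts`) and the injectable `fetcher` parameter are illustrative assumptions, not the authors' actual pipeline:

```python
# Sketch: cursor-paginated collection from Bluesky's public AppView.
# Endpoint and query parameters (q, limit, cursor) follow the documented
# app.bsky.feed.searchPosts API; everything else here is illustrative.
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://public.api.bsky.app/xrpc/app.bsky.feed.searchPosts"

def fetch_page(query, cursor=None, limit=100):
    """Fetch one page of search results from the public AppView."""
    params = {"q": query, "limit": limit}
    if cursor:
        params["cursor"] = cursor
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def collect_posts(query, fetcher=fetch_page, max_posts=1000):
    """Follow the `cursor` field page by page until it runs out or max_posts is hit."""
    posts, cursor = [], None
    while len(posts) < max_posts:
        page = fetcher(query, cursor=cursor)
        posts.extend(page.get("posts", []))
        cursor = page.get("cursor")
        if not cursor:  # a missing cursor marks the last page
            break
    return posts[:max_posts]
```

Injecting `fetcher` keeps the pagination logic testable without network access; a real multi-million-post crawl would more likely consume the firehose than poll a search endpoint.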
Slack knows you have given up on the rest
Nice crown molding
Are you rich enough to use compute as a noun?
May I propose beets
but you can run oogabooga
Did you just get your Bluesky invite? Great! Now, help me complete my Threads graph.
https://www.threads.net/@delip.rao
Posts here are called beets. I don't make the rules.
get in loser
we're re-territorializing the hilbert space
New stage, new tune
Testing