
Posts by Delip Rao


Thrilled to release Gaperon, an open LLM suite for French, English and Coding 🧀

We trained 3 models - 1.5B, 8B, 24B - from scratch on 2-4T tokens of custom data

(TLDR: we cheat and get good scores)

@wissamantoun.bsky.social @rachelbawden.bsky.social @bensagot.bsky.social @zehavoc.bsky.social

5 months ago 34 18 1 4

Yeah, posting something that big for us 2 min before the weekend in the US and late in the evening in France is so not ideal, right before a 4-day weekend here, lol. So we'll redo it and tell you guys much more... #TrainingTragedy
Tbh the only visual allegory possible is this...

5 months ago 7 6 1 0

Thank you for the interest in our work. Look forward to any feedback.

1 year ago 1 0 0 0
WithdrarXiv: A Large-Scale Dataset for Retraction Study
Retractions play a vital role in maintaining scientific integrity, yet systematic studies of retractions in computer science and other STEM fields remain scarce. We present WithdrarXiv, the first larg...

😳 WithdrarXiv 🙏

- Dataset of 14K+ withdrawn arXiv papers
- associated retraction comments
- entire history through 09/24
- taxonomy of retraction reasons, from critical errors to policy violations
- WithdrarXiv-SciFy, enriched version w/ scripts for parsed full-text PDFs

arxiv.org/abs/2412.03775

1 year ago 158 46 5 4
Juicy Research Ideas and How to Find them?
How do people come up with research ideas in AI? Will the "AI Scientist" finally make me work full-time on my chicken farm?

Stumbled across this post on Substack by @deliprao.bsky.social today that I really appreciated as someone trying to break into the field. Simple categorizations can seem trite at times, but they can be deceptively profound in breaking down complex problems.

substack.com/home/post/p-...

1 year ago 1 1 2 0

anyone on my TL can endorse me for cs.DL (digital libraries) on arXiv? 🙏

1 year ago 1 0 0 0

Releasing: a dataset of two million Bluesky posts.

This dataset has been collected using Bluesky's API, and I hope it will be useful for all the researchers out there!

1 year ago 475 54 249 137

Slack knows you have given up on the rest 😏

1 year ago 2 0 1 0

Nice crown molding

1 year ago 0 0 0 0

Are you rich enough to use compute as a noun?

1 year ago 0 0 0 0

May I propose beets

1 year ago 11 0 0 0

but you can run oogabooga

1 year ago 2 0 0 0

Did you just get your BlueSky invite? great! Now, help me complete my threads graph. 😘

https://www.threads.net/@delip.rao

2 years ago 0 0 0 0

Posts here are called beets. I don’t make the rules.

2 years ago 4 0 1 0

get in loser

we’re re-territorializing the hilbert space

2 years ago 14 4 1 0

New stage, new tune

2 years ago 0 0 0 0

Testing

2 years ago 1 0 1 0