
Posts by Luca Soldaini 🎀

it’s OCR week! learn how we use verifiable rewards against unit tests to improve olmOCR’s PDF understanding

state of the art OCR, fully open model:

5 months ago 23 2 1 0

yea i was gonna link 🤣 rough guidelines I’ve heard for multilingual are around 600B+, which at a high level matches yuval’s findings.

5 months ago 3 0 0 0

best commute on earth

7 months ago 18 0 1 0

which of you am i gonna have the pleasure of seeing at COLM???

7 months ago 11 0 5 0

my keystrokes go through a light-up starry cable

OF COURSE my code is better than yours

7 months ago 15 0 0 0

12+ years in this country, first time I get to wear this sticker 🗳️

8 months ago 16 0 0 0

wearing italian camo* at ICML

*ordering iced lattes rather than espressos at coffee shops

8 months ago 4 0 1 0

new @ai2.bsky.social office has something for everyone: stunning views for the outdoorsy kind, 2.5 Gbps connection at every desk for the indoor nerds

9 months ago 13 0 0 0

Waymo is cool but BART from SFO to downtown SF is cooler

101 can be as dark red as you want on google maps!

9 months ago 8 0 0 0

babyyyyy

10 months ago 1 0 0 0

text classification at scale, works great on 70TB of text

10 months ago 2 0 0 0

scales just fine to 70TB of text, supports subword embedding, someone made rust bindings 😌

10 months ago 1 0 0 0

no reason to switch just because the software is no longer updated. compile from scratch, works great!

10 months ago 1 0 0 0

2025 AI hot take: everyone should use FastText more. Word embeddings are awesome.

10 months ago 26 3 3 0

congratulations!!

10 months ago 2 0 0 0

Reddit also has deals with OpenAI and GDM. Maybe negotiation stalled with Anthropic.

10 months ago 1 0 0 0

they are a joy to type with our loud mechanical keyboards

10 months ago 1 0 0 0

today might be rainy, but PNW summer is already here

10 months ago 12 0 0 0

if soldering skills become critical i’m gonna be out of a job soon 😅

11 months ago 1 0 0 0

I've silenced all notifications on all my devices and it's truly the best thing ever

...I am considering allowing calendar notifications tho cuz I almost missed 3 meetings already 😅

11 months ago 25 0 2 0

two weeks traveling and I miss my mechanical keyboard so much

11 months ago 4 0 1 0

MANGO SMOOTHIE

don’t forget da smoothie 🤤

11 months ago 2 0 1 0

when someone says they wanna bring me to their favorite italian restaurant

11 months ago 16 0 0 0

I am still perpetually in awe that skill emergence exists in language models

a million caveats but we have models that pick up capabilities from plain text???

it's so magical, I can't believe we got such a treat

1 year ago 23 2 0 0

congrats!!! amazing news 🥰

1 year ago 4 0 0 0

mittens!

1 year ago 2 0 0 0

PAWS!

1 year ago 1 0 0 0

bluesky deserves to know we’ve adopted a cat and he’s the most handsome boy

1 year ago 62 0 2 0

Summary of the recommendations we submitted to the White House to ensure the success of open & transparent AI

As a meta point, I’m very grateful to be in a position where I can put my technical expertise in the service of policy needs 🥰

1 year ago 7 0 0 0

with so many good vision models out there, choosing a starting model with a friendlier license is better ☺️

Gemma is a great generalist model, but any competent VLM is a good starting point for olmOCR

1 year ago 2 0 1 0