Posts by Jeff Smith

But I fear I'm far too old and slow to be this guy. I don't even *like* Red Bull.

4 days ago 0 0 0 0

I really am better suited to be this guy. I have the sweater vest and everything.

4 days ago 0 0 1 0

So, you don't feel like Ludwig when you're done. You just feel like you played as fast as you could but the game can still be played so much faster and there's probably a teenager in Korea who could mop the floor with you.

4 days ago 0 0 1 0

But with agentic coding, basically being a dev is just Starcraft with shittier graphics. You barely even notice all of the wins, because the model is solving all of the puzzles. You just wind up typing as fast as you can. It's pure video games.

4 days ago 0 0 1 0

Back in the day, being a dev used to be like being Ludwig from that David Mitchell show. You would just pace and whiteboard and ponder until you solved the puzzle. And you felt really smart when you did. It was a big part of the appeal for a lot of us.

4 days ago 1 0 1 0

The basic idea of hash tables is that “the universe is a big place, but it’s mostly empty.”

Hash tables, or how to leverage the sinking feeling of loneliness you get when you look at the sky

5 days ago 86 11 1 0
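
A minimal sketch of that idea, assuming a toy table with separate chaining (the post does not name a collision strategy, and `TinyHashTable` is a hypothetical name used only for illustration). The universe of possible keys is enormous, but a program only ever stores a few of them, so hashing folds that mostly empty universe into a small array of buckets.

```python
class TinyHashTable:
    """Toy hash table: a huge key space folded into a few buckets."""

    def __init__(self, num_buckets: int = 8):
        # Each bucket holds the (key, value) pairs that hash to its index.
        self._buckets = [[] for _ in range(num_buckets)]

    def _index(self, key) -> int:
        # Map any hashable key onto one of the available buckets.
        return hash(key) % len(self._buckets)

    def put(self, key, value) -> None:
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only one bucket is scanned, never the whole table.
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default
```

Eight buckets are enough here because only a handful of the possible keys ever show up; a real implementation would also resize as the table fills.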

The new “This could have been an email” is when someone does interesting research and the only way to consume it is on YouTube

“This could have been a blog post!”

1 week ago 15 3 2 0

Next #ATmosphereConf should be held in Europe

1 week ago 18 1 1 0
Against the dark forest
The complex of ideas I’m going to call the Dark Internet Forest emerges from mostly insidery tech thinking, but from multiple directions.

Required reading for everyone following @kissane.myatproto.social’s awesome #AtmosphereConf keynote.

1 week ago 58 14 0 6

Loved every moment of @kissane.myatproto.social's keynote at #ATMosphereConf.

I need HOLD FAST merch stat. Sweatshirt, totes, and best of all: gloves.
#KelpFacts

1 week ago 6 0 0 0

Which is why, although all John Luther Adams music will put you to sleep, all John Adams music will disturb your rest forever.

1 week ago 0 0 0 0

John Adams music is like if you put a car factory on the back of a semi. And then the driver had intrusive thoughts.

1 week ago 0 0 1 0
The toughest AI benchmark just got a whole lot tougher
ARC-AGI-3 is the latest version of a clever benchmark that challenges AI models to solve mini video games with no written instructions....

For the ARC-AGI-3 benchmark test, the developers made interactive puzzle games sherwood.news/tech/the-tou...

1 week ago 6 2 1 0
Online bot traffic will exceed human traffic by 2027, Cloudflare CEO says | TechCrunch
AI bots may outnumber humans online by 2027, says Cloudflare CEO Matthew Prince, as generative AI agents dramatically increase web traffic and infrastructure demands.

Online bot traffic will exceed human traffic by 2027, Cloudflare CEO says

2 weeks ago 6 2 1 0

Claude is soooo slowwwwwwww when America is awake

go back to sleep, y'all

1 week ago 7 1 0 0
Research Scientist, Reinforcement Learning London, UK

DeepMind's RL team is hiring a research scientist: if you're passionate about RL, come work with us!

And if you know people who might be interested, please share:
job-boards.greenhouse.io/deepmind/job...

2 weeks ago 28 14 1 0

@martin.kleppmann.com talking to @qconferences.com London about @bsky.app and #ATproto.

2 weeks ago 6 1 0 0

I'll be at #QCon London tomorrow talking about this. Come find me if you're working on open source review tooling or contributor trust. #oss #genAI #codingAI

2 weeks ago 1 0 0 0

We're also working on the cold-start problem. Scoring new contributors LOW is accurate but not useful. The next step is tooling that helps first-time contributors understand a project's expectations before they submit.

2 weeks ago 0 0 1 0

Where we're headed: contributor scoring tells you who someone is. The harder question is whether a specific PR fits the target repo. We've seen strong signal in repo-specific fingerprinting and we're building tools around it.

2 weeks ago 0 0 1 0
A Basket of Eggs: Revisiting How We Score Open Source Contributors

Full writeup with methodology and data: neotenyai.substack.com/p/a-basket-o...

2 weeks ago 0 0 1 0

We also pulled account age out of the score and into a separate advisory. The score now means one thing. Account age is context alongside it, not blended in.

2 weeks ago 0 0 1 0

New default: one ratio. Directly interpretable. If a contributor has a 78% merge rate, that's the score. No graph construction, no regression coefficients.

2 weeks ago 0 0 1 0
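
A minimal sketch of a single-ratio score like the one described above. It is not Good Egg's actual code: `ContributorHistory`, its field names, and returning `None` for a contributor with no resolved PRs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContributorHistory:
    # Hypothetical record of a contributor's resolved pull requests.
    merged_prs: int
    closed_unmerged_prs: int


def merge_rate(history: ContributorHistory) -> Optional[float]:
    """Return merged / (merged + closed-unmerged), or None with no history."""
    total = history.merged_prs + history.closed_unmerged_prs
    if total == 0:
        # Assumption: no resolved PRs means no score, rather than a score of 0.
        return None
    return history.merged_prs / total
```

With 78 merged and 22 closed-unmerged PRs this gives 0.78, which reads directly as a 78% merge rate.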

That pushed us to question the scoring model. The graph score (the most complex part of the system) actively hurt predictions for unknown contributors. Merge rate alone outperforms the full model at every tier.

2 weeks ago 0 0 1 0

We tried to detect suspended GitHub accounts from behavioral signals. LLMs, network analysis, title patterns. None of it worked on contributors who'd gotten code through review. They look like everyone else. The merge process itself is the filter.

2 weeks ago 0 0 1 0

We rebuilt Good Egg's scoring model from the ground up. Simpler, faster, more accurate for the contributors who actually need scoring. Here's what we learned and where we're headed. 🧵

2 weeks ago 0 0 1 0
GitHub - 2ndSetAI/good-egg: Trust scoring for GitHub PR authors using graph-based ranking on contribution graphs

Full methodology, all scoring data, and the failures are published alongside the successes. github.com/2ndSetAI/goo...

1 month ago 0 0 0 0

v1 and v2 have identical AUC (0.647). We shipped v2 anyway because merge rate corrects survivorship bias and account age stabilizes sparse graphs. Both carry confirmed statistical signal. The flat AUC just means the graph already captures most ranking information.

1 month ago 0 0 1 0

We tested seven features on 5,129 PRs across 49 repos. Three survived. Most interesting failure: text similarity between PR descriptions and project READMEs. Higher similarity predicted lower merge rates. We think low-effort PRs parrot project language.

1 month ago 0 0 1 0

The case that motivated it: Guillermo Rauch scores MEDIUM against his own company's Next.js repo. Zero merged PRs in Next.js itself. v2 factors in his 17.7-year account and 78% merge rate, pushing him to HIGH.

1 month ago 0 0 1 0