Seth Kimmel (@sethkim.me) Bsky

Really great collaborating with @nathanlile.bsky.social!

Reach out if you're working on synthetic data generation, offline RL, or simulating agentic behavior.

9 months ago 1 0 0 0

Had a lot of fun with this one! Turns out that >1.5% of posts on Hacker News are aviation-related in recent years. www.skysight.inc/blog/hacker-...

1 year ago 2 0 0 0

Model Security with Large-Scale Inference _(Adapted from our talk at the Modal x Mistral Demo Night in San Francisco on March 6th, 2025)_

Gave a fun talk at the @modal-labs.bsky.social and Mistral AI demo night last week in SF! We discussed open-source model security, applying some of the large-scale inference techniques we've been working on recently.

www.skysight.inc/blog/model-s...

1 year ago 0 0 0 0

ALT: a man with a beard and a scarf around his head says lisan al gaib on the bottom

1 year ago 9 0 0 0

Do they have to nerd snipe us this bad?

1 year ago 6 0 0 0

show us the way

1 year ago 7 0 1 0

I've always wanted to like python notebooks

1 year ago 4 0 1 0

🥵 #dataBS

1 year ago 5 0 0 0

Thank you @jakthom.bsky.social!

1 year ago 0 0 0 0

Anyone want to make #graphBS a thing?

1 year ago 1 0 0 0

Depressing will be the day a developer asks me:

"What is StackOverflow?"

1 year ago 1 0 1 0

It seems pretty clear to me that this thing will be as big/bigger than X in terms of sheer number of users

1 year ago 1 0 0 0

Do you think this is different than human confidence? I'd say 99.9% of what we think is true is second-hand knowledge

1 year ago 0 0 1 0

Using logprobs | OpenAI Cookbook Open-source examples and guides for building with the OpenAI API. Browse a collection of snippets, advanced techniques and walkthroughs. Share your own examples and guides.

Somewhat disagree here. Have you ever looked at logprobs? The model far prefers steering in directions that it feels confident in given alternatives. cookbook.openai.com/examples/usi...

1 year ago 0 0 1 0

So do humans! It's why we have QA/testing, and jobs that are just pure oversight.

You might expect both an LLM and a human to get a handful of data labeling tasks wrong, but have it checked with a verifier/adversarial LLM and you'll likely get ~100% accuracy.

1 year ago 0 0 0 0

That being said, my guess is progress will resume when we start to generate high-quality, focused synthetic data. Sort of like forcing a human to go to the library and acquire actual knowledge instead of scrolling on social media all day

1 year ago 1 0 0 0

How can we be surprised that LLM scaling laws don't hold when the training data is literally just crap people write on the internet?

1 year ago 1 0 0 1

Btw the bait is me just saying there is no debate

1 year ago 1 0 0 0

Okay now that everyone is here who wants to get baited into an R vs. Python debate?

1 year ago 0 0 0 1

Seems like the holiday is bringing huge amounts of new users over here. @jaz.bsky.social is your data supporting this?

1 year ago 0 0 0 0

Who is going to build the billion dollar slop/non-slop classifier?

1 year ago 2 0 0 0

While a lot of the content on X is clearly written by humans, it's sort of degraded into subhuman patterns of engagement. Lots of cryptic speak, baiting, trolling, inflaming, etc. Glad this place has actual humans posting actual human thoughts.

1 year ago 0 0 0 0

This is super cool! One has to assume that the most open, programmable, and hackable content platforms win in the long run

1 year ago 2 0 0 0

Becoming a more confident engineer isn't about writing less dumb code; it's about accepting the fact that everyone else's code is just as dumb as yours

1 year ago 0 1 0 0

GenAI - pay more for inefficient ML models
Crypto - pay more for others to verify your own transactions
SaaS - pay more for something a spreadsheet can do
Cloud - pay more for someone else’s computer
Mobile - pay more for apps that can run in a browser

1 year ago 1 0 0 0

I'm a big fan of the "anti" data warehouse approach! Users shouldn't be forced to store their data in a third-party system to get the benefits of its processing capabilities.

1 year ago 3 0 0 0

People forget that it's not unusual for Apple to release products that initially suck and are iteratively refined. They take big bets.

The original iPhone, Apple Maps, etc.

My guess is Apple Intelligence will have a dominant, frontier consumer AI product within ~5 years.

1 year ago 0 0 0 0

Startup idea: Cursor, but it just shits on your patterns and bullies you into refactoring your entire codebase every time you ask it a question.

1 year ago 5 0 0 0

4. The notion of consensus will be a lot more important, and agentic moderators might be in charge of modifying embedding indexes to more accurately represent reality and remove hallucinations/biases

1 year ago 0 0 0 0

3. APIs and internet-based services might not be as rigid. An LLM can more freely negotiate with a service provider if their request doesn't conform to a certain standard.

1 year ago 0 0 0 0
