So is BPE (for NLP). It's always crazy to me that these came out at roughly the same time.
aclanthology.org/P16-1162/
Just needs prediction markets and it will be product of the year.
Like if there are a few similar moves that all look good (no hanging pieces, equal trades, etc.) but one leads to a piece loss a few moves later. These are obviously great calculation and visualization puzzles, but they're pretty hard to get right and frustrating for people using the app casually.
The primary thing I'm having trouble with is positions where the blunder doesn't become apparent until 4-5 moves later. These are not good for a single-turn daily puzzle app.
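For the curious, here's roughly how one could screen for these, as a sketch using python-chess and a local Stockfish binary (the threshold and depth numbers are made up for illustration, not what the app actually uses):

import chess
import chess.engine

def depth_when_apparent(fen, move_uci, threshold_cp=150, max_depth=14):
    """Shallowest engine depth at which the candidate move's eval loss
    exceeds threshold_cp. A high value means the blunder only shows up
    many plies later -- a bad fit for a casual single-turn puzzle."""
    board = chess.Board(fen)
    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        for depth in range(1, max_depth + 1):
            limit = chess.engine.Limit(depth=depth)
            best = engine.analyse(board, limit)["score"].relative
            board.push(chess.Move.from_uci(move_uci))
            reply = engine.analyse(board, limit)["score"].relative
            board.pop()
            # reply is scored from the opponent's POV, so the mover's
            # loss is best (mover POV) plus reply (opponent POV).
            loss = best.score(mate_score=10_000) + reply.score(mate_score=10_000)
            if loss >= threshold_cp:
                return depth
    return None  # never looks bad within max_depth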
blunder.clinic #42 • 1200
🟩🟩🟨⬜🟥🟥
5/6 💪
blunder.clinic #42 • 1500
🟩⬜🟨🟨⬜🟥
4/6 💪
blunder.clinic #42 • 1800
⬜🟩🟨🟨🟥⬜
4/6 💪
I built an interactive guide on how Shazam (the music identification app) works!
This is the next installment in my newly coined "How The Heck?" series, where we explore everyday tech that can feel like magic (QR Codes, GPS, and now this one).
Hope you enjoy it!
perthirtysix.com/how-the-heck...
Stochastic tokenization is one of my favorite topics. Here is a recent preprint on arXiv.
h/t @trappmartin.bsky.social
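(If the term is new to you: the idea is that the same string can get a different segmentation every time you tokenize it. A minimal sketch with SentencePiece's sampling mode, assuming you already have a trained model file; the filename here is a placeholder:)

import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="m.model")  # placeholder path
for _ in range(3):
    # enable_sampling draws a segmentation instead of the single best
    # one, so repeated calls can split the same word differently.
    print(sp.encode("tokenization", out_type=str,
                    enable_sampling=True, alpha=0.1, nbest_size=-1))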
I watched a video where they trained a new barista by giving her a crash course and then having her make like 100 lattes in a row. This is the dream, but I don't know that many people in the Bay.
Making 1-2 per day just isn't enough for consistent improvement.
Maybe there is a lesson in this.
I have been wanting to host an "apartment cafe" for a while, but I need to get a bit more consistent with my latte art. So I held a dry run for a few people this weekend so I could just make a bunch of lattes in a row without risking my health.
Here are some of my better ones.
Oh, it's actually not his last name but his given name:
현덕 = 賢德
His last name is 송 = 宋.
TIL! This is a cool name for sure. Thanks!
Hmm, I can't say that "virtuous" and "protoss player" go together in my head!
A thought occurred to me last night that Lee Sedol (the Go player) is about as close to Korean nominative determinism as I've ever heard.
이세돌 ≈ 二三石, if you squint a bit.
[Image: a screenshot of a white-on-black terminal showing a 19x19 Go board in ASCII graphics, with empty grid intersections as periods and black and white stones as Os and #s]
It’s absolutely incredible that one of the largest Japanese-run Go servers, which has been running since 1992, is still accessed entirely via Telnet. And while most players use GUI clients that speak Telnet under the hood, you can still connect manually and get ASCII graphics streamed to you.
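You don't even need a telnet client to try it; a raw TCP socket is enough. A sketch (the host and port are placeholders, since I don't want to misquote the real server's address):

import socket

HOST, PORT = "go-server.example.net", 23  # placeholders, not the real address
with socket.create_connection((HOST, PORT), timeout=10) as sock:
    # The login banner arrives as plain ASCII, same as a telnet client sees.
    print(sock.recv(4096).decode("ascii", errors="replace"))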
One of these days I'll get 6/6 x 3
blunder.clinic #41 • 1200
🟩🟩🟨🟨🟥⬜
5/6 💪
blunder.clinic #41 • 1500
🟩🟩⬜🟨🟥⬜
4/6 💪
blunder.clinic #41 • 1800
🟩🟩⬜🟨⬜⬜
3/6 💪
#chess
Missed a lot of greens today 😬
blunder.clinic #40 • 1200
🟩🟩🟨🟨⬜🟥
5/6 💪
blunder.clinic #40 • 1500
⬜⬜🟨🟨🟥🟥
4/6 💪
blunder.clinic #40 • 1800
⬜🟩🟨🟨🟥⬜
4/6 💪
Cool profile of @metr.org’s work in the NYT today! Particularly like this from my colleague Ajeya: “METR is an organization that asks... what we think would be most valuable for the world to know about A.I. and its risks, and then the answers are what they are.”
www.nytimes.com/2026/04/17/t....
measured Opus 4.7's new tokenizer today. English: 1.45× more tokens than 4.6. Cyrillic: 1.00×.
the optimization play is clear: write your codebase in Russian-transliterated keywords.
функция факториал(н):
free 30% token discount, unreadable to English speakers, huge productivity win
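(For non-Cyrillic readers, функция факториал(н): is just "function factorial(n):". And if you want to run this kind of comparison yourself, here's a sketch using GPT-2's tokenizer as a stand-in; the ratio you get depends entirely on what the tokenizer was trained on:)

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in tokenizer

english = "def factorial(n): return 1 if n <= 1 else n * factorial(n - 1)"
cyrillic = "функция факториал(н): вернуть 1 если н <= 1 иначе н * факториал(н - 1)"

for label, text in [("english", english), ("cyrillic", cyrillic)]:
    print(label, len(tok(text)["input_ids"]))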
This is a great idea haha. Now how can it be adapted to English classes?
Are you a PhD student interested in game AI? Submit to AIIDE 2026’s Doctoral Consortium! The feedback a DC provides helps sharpen your dissertation focus while informing you about potential career options.
Applications are due July 25, 2026! Read more at: tinyurl.com/aiide26dc
This is a good point, because treating whitespace as its own token would fix this problem.
Token healing partially solves the problem by popping the last token of your prompt preamble and then ensuring that the next token the model produces matches the surface form of the popped token + the beginning of your prompt.
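A minimal sketch of the idea, using GPT-2 via Hugging Face as a stand-in (any causal LM + tokenizer works the same way):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The URL is http:")["input_ids"]

# 1. Pop the last prompt token and remember its surface form.
healed_ids, popped = ids[:-1], tok.decode(ids[-1:])

# 2. Only allow next tokens whose surface form starts with the popped
#    text, so the model can re-merge across the prompt boundary.
with torch.no_grad():
    logits = model(torch.tensor([healed_ids])).logits[0, -1]
# (Scanning the whole vocab is slow but clear; real implementations
# precompute a prefix index.)
allowed = [i for i in range(len(tok)) if tok.decode([i]).startswith(popped)]
masked = torch.full_like(logits, float("-inf"))
masked[allowed] = logits[allowed]

next_id = int(masked.argmax())
print(tok.decode(healed_ids + [next_id]))  # can now emit "http://" as one piece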
Ah yeah, this still happens in a lot of prompting setups + constrained generation. "Partial token problem" and "token healing" are the keywords to search for.
Here's a great paper about it.
Here is one reason you might want to treat whitespace separately: words w/ and w/o a leading whitespace get tokenized differently.
Having a separate whitespace token would unify a lot of these.
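Tiny demo with GPT-2's BPE via Hugging Face transformers (the exact ids are tokenizer-specific, but every BPE vocab shows the same split):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
print(tok("hello")["input_ids"])   # one id for "hello"
print(tok(" hello")["input_ids"])  # a completely different id for " hello"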
I feel like I read an xkcd where they place a computer inside of Narnia to speed up computation in our world, but now I can't find it.
There is a Narnia comic with a computer, but it differs right at the end.