aria (@aurelium.me) Bsky

as grim as it is I am kind of hoping for like. a full blown emotional breakdown in public

16 hours ago 0 0 0 0

i guess air travel is the thing that *would* make Newspaper People concerned but it honestly seems kind of like small potatoes given that we are in a state where the only question seems to be "is this going to be worse than COVID economically or slightly less bad"

23 hours ago 4 1 0 0

[image: a graph from Weights & Biases showing two noisy curves on the same axes. the two lines track each other closely but begin diverging near the end of the graph]

I think I might be a bad scientist

I am in the nasty habit of running my experiment before my baseline, so whenever I start the baseline run I spend the whole day rooting for the gap between the grey line and the brown one to get wider

1 day ago 11 1 0 0

throughout The Sopranos I found it pretty notable that all of these guys are constantly getting shot at and going to jail for a pretty mediocre income

everyone below Tony seems to live a basically unremarkable middle class lifestyle and Tony himself manages only "dentist rich"

2 days ago 8 1 0 0

i mean there's a fundamental tradeoff. compute is not unlimited and, worse, you can only speed up training so much by adding more. mythos is a bet specifically that relatively modest RL on a much more capable base is worth it

2 days ago 1 0 0 0

ah, sorry, was looking at it backwards. point stands though

2 days ago 1 0 1 0

this entire thing, besides the weird nothing interviews, is also basically a retread of a series of Barely Sociable videos from 2020 which present all the same evidence

2 days ago 2 0 0 0

not that it really matters but honestly I'd bet that they never mined any part of the strait. much easier to just say you did

after all, however confident I am, I would not bet my life on it

2 days ago 9 0 1 0

source of this could be queuing or it could just be an abnormally high batch size for whatever preview they ran on to limit the size of the pre-release deployment

2 days ago 3 0 0 0

to be clear:

1. this is a "grapevine" kind of rumor so I don't have specifics
2. the way I have seen it phrased implies that they are talking about full response time

2 days ago 8 0 1 0

yeah, this is true, although it seems by comparing Ant's internal benchmark set to the actual ECI scores on previous models that their internal one is somewhat easier, so you can probably treat that value as an upper bound to its real ECI score

2 days ago 2 0 1 0

based on anthropic's general research direction/philosophy I think it's just a relatively-dense 10T-20T trained similarly to Opus 4.6

according to the system card it actually uses fewer reasoning tokens on average than Opus 4.6

2 days ago 4 0 1 0

I don't think it's like that, I think it's just a very slow model with CoT reasoning

2 days ago 7 0 1 0

if the rumor is correct it appears to be functionally equivalent to the GPT-5.x Pro series, although scaled on the axis of model size rather than reasoning depth

outside of SWE it ranks similarly to GPT-5.4 Pro on general benchmark sets

2 days ago 14 0 4 0

you'd notice because it would take 15-25 minutes to respond, reportedly

2 days ago 50 0 3 1

fwiw, besides the obvious SWE-maxxing RL, it seems the actual baseline intelligence capacity of Claude Mythos is not that much higher than GPT-5.4 Pro (probably <3T), it's just that they don't really evaluate the Pro models on agentic stuff because that's unhinged

3 days ago 4 0 0 0

my understanding is that if there were an exact copy of the solar system 10ly away we would have to get pretty lucky with angles to even notice Earth was there

our technology makes our sampling of exoplanets pretty biased against planets like Earth around sun-like stars, inconveniently

3 days ago 4 0 0 0

worth noting that a lot of thinking on the Fermi paradox assumed the intensity of our signal cast into space would increase over time

it has decreased iirc. even peak Earth radio noise would not be detectable to someone with 2026 tech from 10ly - dubious if Earth itself would even be noticeable

3 days ago 5 0 1 0

prior to the upcoming energy crisis the biggest recent rise in cost of living for urban America can be attributed to a decades-long ideologically-driven freeze on housing construction

Oops, All Ideology!

3 days ago 3 0 0 0

hard to buy into the profit maximizer narrative when the last few years have been characterized by a fit of ideological madness among the US elite running completely counter to their financial interests

3 days ago 5 0 1 0

many events competing for the headline today but we all know the real groundbreaking news is Minecraft switching to Vulkan

4 days ago 12 0 1 0

not excited for the "urban americans aren't real americans" consensus to fuck me over by focusing on individual commutes instead of logistics, which is by far my main exposure to this besides non-fuel petrochem

4 days ago 3 0 0 0

they will not be releasing mythos-preview publicly

4 days ago 5 0 1 0

see I only know of like two people meeting this description. is it more common than that

4 days ago 8 1 1 0

competition math**

4 days ago 12 0 0 0

I'll be the first to admit: not uniformly, yet. comparable/better for a lot of general agentic tasks and competition but polishing our SWE post-training stack is one of our big focuses for the next model we release

base model is 100% capable of it imo, it's just a matter of data and environments

4 days ago 18 0 1 0

I do like running my own AI at home, but it's worth acknowledging that my box pulls >100W idle, fights my air conditioner, and has a dismal PUE of probably <~30%, compared to >90% for a datacenter. you should use datacenters more and self-hosting less if you care about energy usage.

4 days ago 101 7 6 3

i have worked here for 9 months and don't actually know if this is true

4 days ago 24 1 1 0

in my unbiased opinion arcee is downright great

4 days ago 26 0 2 0

everyone quoting this with approval is either named something like "fizzy the tiefling prince 🌈 acab blm 🖌️ COMMS OPEN" or "anticiv anarcho-revanchist 🔻🪂" and they're all the most unbearable kind of person you will ever meet

4 days ago 96 14 5 1
