here's another bsky.app/profile/doll...
Posts by Milan Weibel
here's one bsky.app/profile/segy...
but then again the 3 month figure is just an average and current narrowing could be fluctuation rather than trend
i love the phrase "inside traders announce"
very roughly: gpt5.4 was released 46 days ago and this new kimi matches it, so that's the current gap. according to epoch the gap has historically been 3 months, so yes, things are narrower than in the period up to october last year
epoch.ai/data-insight...
actually now that i thought about it more i'm not sure it's narrowing. has been pretty narrow for a while.
the open-closed gap on generally-available model performance is narrowing
frontier labs risk being undercut in opus-class until they make mythos-class GA
yes
in principle you could have a model so aligned that it successfully misalignment-fakes any attempt to misalign it
...unless the misalignment finetuning is sft
FT 2 days ago: "Amodei says he suspects open-source models and Chinese developers will be able to replicate Mythos's capabilities within six to 12 months."
but mythos isn't available to be distilled
hmm
mathematicians as stochastic parrots?
lots of polemics i don't agree with and some less than convincing arguments in linked post, but the em dash to ; replacement in the dune quote piece of evidence is a slam dunk
"the machines are fine. i'm worried about us" was written by claude
im 99.5% sure that account is run by a human
entropybro coded tho
prompt for a demo from the GPT2 announcement. for some reason it's seared into my mind.
“In a shocking finding, scientists discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains…”
track record on that? afaict crypto is a bit less prominent now than at its peak but still around
compute OSINT
measuring datacenter by the GW is so metal
user @birddroneone.bsky.social says AI is useful for coding in a thread full of negationists
user @birddroneone.bsky.social posts: I proudly earned my way onto a bunch of idiots “AI booster” block lists despite the fact that I hate AI and any real review of my posting history makes that clear. My crime? Acknowledging that all commercial software development today uses AI. I wish it wasn’t true. But it is.
love to see epistemic integrity like this
where is this survey from? a uni thing?
oh lmao
sure, there's a tradeoff
why do you expect that philosophy you tested to be outside its training data?
pure speculation but what if this is connected to 4.7 being a new pretrain? maybe pelican drawing is developmentally later, as a post-training effect, than coding
my only complaint with opus 4.7 rollout is anthropic setting default effort to xhigh
nine opus instances working together for days did really well at weak-to-strong supervision on qwen
what about false positives though?
also worth mentioning that they have had 9 presidents in the last 10 years
there were other elections today, in peru: 36 presidential candidates on the ballot
voting was extended to tomorrow due to logistical problems in some polling places