
Posts by Sam Harsimony

There's a chance they IPO to high valuations this year and then those valuations fall a bit as adoption slows and revenue growth stabilizes. Wouldn't really call that a bubble though.

1 hour ago 0 0 0 0

Recently been feeling less concerned about a bubble (at least for the next ~year). Companies are slowing their compute spend a bit and focusing more on the bottom line.

Still think they won't be able to charge crazy profit margins (which is good for us).

bsky.app/profile/hars...

1 hour ago 2 0 1 0

It sure would be nice to have a theorem proving that this is the case in general with NNs.

Like, it seems like far-OOD behavior should be random, right?
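As a toy illustration of how fitted function approximators behave far outside their training range (using a polynomial fit as a crude stand-in for an NN, not a claim about any particular architecture):

```python
import numpy as np

# Toy stand-in for an NN: fit a degree-9 polynomial to noisy data on [0, 1],
# then query it far outside the training range.
rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, 50)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, 50)

coeffs = np.polyfit(x_train, y_train, deg=9)

# In-distribution: predictions stay bounded, like the data.
in_dist = np.polyval(coeffs, 0.5)

# Far OOD: the extrapolated value is enormous and essentially arbitrary --
# it depends on noise and fitting details, not on the target function.
far_ood = np.polyval(coeffs, 100.0)

print(abs(in_dist) < 2.0)   # bounded in-distribution
print(abs(far_ood) > 1e6)   # unbounded far OOD
```

The far-OOD output isn't "random" in a formal sense, but it is uncontrolled by the training data, which is the intuition behind wanting a theorem here.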

1 hour ago 0 0 0 0

By and large, when AIs fail, they fail in quirky ways. Not catastrophic, just weird/funny.

Very good that the current training paradigm does silly things when exposed to OOD inputs rather than becoming a paperclip maximizer.

1 hour ago 2 0 1 0
Robots Are Quietly Building the Future of Renewable Energy | OilPrice.com Automation and robotics are rapidly transforming renewable energy development by accelerating construction, reducing costs, and addressing critical labor shortages.

AES used Maximo robots to install 100 MW at Bellefield, with crews fitting ~24 photovoltaic modules per person per hour; Civ Robotics' CivDot marks ~3,000 layout points daily with ~8 mm accuracy and 100+ units are in the field.

4 hours ago 4 1 0 0

I've been waiting to update this thread once DeepSeek V4 comes out. Neglected to include MiniMax but will add!

bsky.app/profile/hars...

3 hours ago 0 0 0 0

Both of course! Though I'm optimistic that in the long term the downsides can be addressed.

17 hours ago 0 0 1 0

Promising initial results on an mRNA vaccine for pancreatic cancer!
www.nbcnews.com/health/cance...

vaccines remain a highly underrated technology
bsky.app/profile/hars...

17 hours ago 3 0 0 0

I have a theory that AI companies will realize that brute scaling isn't as profitable and will pivot to specialization.

bsky.app/profile/hars...

19 hours ago 2 0 2 0

Yeah my assumption is that the gap will stay relatively constant. And it's larger than what the benchmarks would suggest, maybe like 6-9 months?

But consequences are the same, lots of competition in AI inference will make it cheap.

19 hours ago 1 0 0 0
Reading today's open-closed performance gap The complex factors that determine the single evaluation number so many focus on. Plus, how this changes in the future.

A TLDR is that unless the training dynamics of leading LLMs change or open model builders run out of money, this ~6 month performance gap from closed to open models is here to stay.
www.interconnects.ai/p/reading-to...

22 hours ago 21 3 1 1

Interesting. I don't really get why they're trying to push profits up and angering users at this point in time. Seems better to build up a lot of goodwill going into their IPO?

20 hours ago 4 0 2 0
Post image

On artificial analysis, the input cost increased 77% from 4.6 -> 4.7 BUT the reasoning was more efficient, so lower cost overall for 4.7.

Curious how this is going to impact prices programmers are paying on net.
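As a sketch of why a higher input price can still mean a cheaper request overall when reasoning gets more efficient (all prices and token counts below are made up for illustration, not the real 4.6/4.7 numbers):

```python
# Total cost = input_price * input_tokens + output_price * output_tokens.
def job_cost(input_price_per_m, output_price_per_m, input_tokens, output_tokens):
    """Cost in dollars for one request, prices quoted per million tokens."""
    return (input_price_per_m * input_tokens
            + output_price_per_m * output_tokens) / 1e6

# Old model: cheaper input, but verbose reasoning (many output tokens).
old = job_cost(0.60, 2.20, input_tokens=10_000, output_tokens=8_000)

# New model: input price up 77%, but reasoning is ~half as long.
new = job_cost(0.60 * 1.77, 2.20, input_tokens=10_000, output_tokens=4_000)

print(new < old)  # shorter reasoning outweighs the input price hike
```

Whether this holds for real workloads depends on the input/output token mix, which is why the net effect on programmers is hard to call.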

22 hours ago 3 0 1 0

If you change the tokenizer to use 46% more input tokens, is that not just a sneaky way to implement a 46% price increase?
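The arithmetic behind that suspicion, with a hypothetical per-token price:

```python
# If the per-token price is unchanged but the tokenizer now emits 46% more
# tokens for the same text, the cost of that text rises 46%.
price_per_token = 2.0e-6              # hypothetical: $2 per million input tokens
old_tokens = 1_000                    # tokens for some document, old tokenizer
new_tokens = round(old_tokens * 1.46)  # same document, new tokenizer

old_cost = old_tokens * price_per_token
new_cost = new_tokens * price_per_token

print(new_cost / old_cost)  # 1.46 -> effectively a 46% price increase
```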

bsky.app/profile/simo...

22 hours ago 11 0 1 0
Post image

Last year, we introduced FlexOlmo, a novel way to train parts of a model independently then combine them later.

BAR builds on that idea for a harder problem: how to keep improving a model without having to retrain each time. 🧵

1 day ago 35 10 1 1

I'm impressed with Kimi because they've increased capabilities *without* increasing memory footprint.

None of the other Chinese labs can say the same. And I suspect that Google, Anthropic, and OAI are also pushing up model sizes (and operating costs).

1 day ago 12 0 1 0
Pregnancy vaccine reduces baby hospital admissions for RSV by 80% A study confirms the vaccine gives excellent protection for babies against life-threatening chest infections.

"A vaccine during pregnancy which protects newborns against nasty chest infections is cutting hospital admissions of babies by more than 80%, UK health officials say."

www.bbc.com/news/article...

It's great news for those in the ~40 countries that aren't NZ.

In NZ, it is not Medsafe-authorised.

1 day ago 8 2 1 0
Post image

Kimi 2.6 is now available on @hf.co 🔥🎉
huggingface.co/moonshotai/K...

✨ 1T MoE / 32B active / 256K context
✨ Agent Swarm: 300 sub-agents × 4,000 steps
✨ Modified MIT
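Rough napkin math on what those specs imply (the byte-per-weight precision is my assumption for illustration, not Moonshot's published serving config):

```python
# Back-of-envelope for a 1T-parameter MoE with 32B active params per token.
total_params = 1e12
active_params = 32e9
bytes_per_param = 1  # assumed 8-bit weights

# Memory footprint scales with TOTAL params: every expert must be resident.
weight_memory_gb = total_params * bytes_per_param / 1e9

# Compute per token scales with ACTIVE params (~2 FLOPs per active weight).
flops_per_token = 2 * active_params

print(weight_memory_gb)  # ~1000 GB of weights to hold
print(flops_per_token)   # ~6.4e10 FLOPs per generated token
```

This is the MoE trade-off in miniature: per-token compute like a 32B dense model, but memory (and serving cost) driven by the full trillion parameters.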

1 day ago 30 6 2 0

Yup, this is just rent control in a different format.

Perhaps we should replace the word "profit" with "making things that people really want and that nobody else is making"

1 day ago 6 0 0 0
Contra Benn Jordan, data center (and all) sub-audible infrasound issues are fake One of the most popular videos made about data centers ever is a complete moment-by-moment disaster

Folks, infrasound issues are fake. This was truly an insane experience to write and I hope you enjoy
blog.andymasley.com/p/contra-ben...

1 day ago 306 68 22 29

- Large Sample Covariance Matrices and High-Dimensional Data Analysis, Jianfeng Yao, Shurong Zheng, Zhidong Bai
- Quantum Computing Since Democritus, Scott Aaronson
- All of Statistics, Larry Wasserman

2 days ago 1 0 0 0

From a skim of the paper seems like there are multiple attention heads but it's not an MoE architecture.

2 days ago 1 0 1 0
Parcae: Doing More with Fewer Parameters using Stable Looped Models, by Hayden Prairie, Zachary Novack, Taylor Berg-Kirkpatrick, and Dan Fu

Some neat work on stabilizing looped transformer models.

Looped (or universal) transformers are interesting because they shrink the memory footprint and thus lower inference costs.
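A toy sketch of that parameter-sharing argument (the "block" here is just a linear map with a nonlinearity, not real attention, and this is not Parcae's actual method):

```python
import numpy as np

# Looped (universal) transformers apply ONE shared block repeatedly
# instead of stacking L distinct blocks, shrinking stored weights by ~L.
d, loops = 64, 12
rng = np.random.default_rng(0)

# Standard depth-L model: L distinct weight matrices.
standard_weights = [rng.normal(0, 0.1, (d, d)) for _ in range(loops)]

# Looped model: a single weight matrix reused every iteration.
shared_weight = rng.normal(0, 0.1, (d, d))

def looped_forward(x, W, n_loops):
    # Same compute as depth L, but only one block's weights in memory.
    for _ in range(n_loops):
        x = np.tanh(W @ x)
    return x

x = rng.normal(0, 1, d)
y = looped_forward(x, shared_weight, loops)

standard_params = loops * d * d
looped_params = d * d
print(standard_params // looped_params)  # 12x fewer parameters to store
```

Compute per forward pass is unchanged; it's the weight memory (and thus serving footprint) that shrinks, which is what makes stabilizing these models interesting for inference costs.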

sandyresearch.github.io/parcae/

2 days ago 3 1 1 0
Nectome: All That I Know — LessWrong TLDR: I flew to Oregon to investigate Nectome, a brain preservation startup, and talk to their entire team. They’re an ambitious company, looking to…

A long review of Nectome, a brain preservation startup.

www.lesswrong.com/posts/3i5GMh...

3 days ago 2 0 0 0
Post image

I wouldn't be all that surprised to see more of this going into the future. If transformers have some cap on capabilities, I would expect to see models that specialize in one area get worse in others by necessity. If it takes huge, huge, huge transformers to get AGI, anything less must be specialized.

3 days ago 59 3 8 3
Sleep experiment initial results and notes
Co-authors: niplav and No Magic Pill

splittinginfinity.substack.com/p/sleep-expe...

3 days ago 0 0 0 0

bsky.app/profile/did:...

3 days ago 0 0 0 0
Training on aligned data mostly solves alignment Defensive technologies and law can do the rest.

New post! I think aligned data, safeguards, defensive technologies, and law will lower AI risks enough that we can move forward with its development.

splittinginfinity.substack.com/p/training-o...

3 days ago 1 1 0 1

I don't actually get why people don't block tankies on sight and feel like they need to refute them. We ought to be treating them as similarly revolting to the gooner pedo frog Nazis we block on sight on Twitter.

5 days ago 43 5 3 1
DeepSeek Sparse Attention from First Principles FLOPs, dollars and a path to million-token context window

Ooh Tensor Economics has a new post on DeepSeek Sparse attention. Exciting!

www.tensoreconomics.com/p/deepseek-s...

5 days ago 0 0 0 0