Getting LLMs to simulate “true” randomness or generate diverse outputs is surprisingly difficult. We found a simple prompting trick that solves this by having the model generate and manipulate a random string. To be presented at #ICLR2026 this week!
Blog: pub.sakana.ai/ssot
Posts by hardmaru
Can LLMs flip coins in their heads?
When prompted to "Flip a fair coin" 100 times, the heads-to-tails ratio drifts far from 50:50. LLMs understand what the target probability should be, but generating outputs that faithfully follow a given distribution is a separate problem.
pub.sakana.ai/ssot
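A hypothetical illustration of the drift described above: tally mock coin-flip responses (stand-ins, not actual model outputs) and measure how far the empirical ratio strays from fair.

```python
from collections import Counter

def heads_ratio(responses):
    """Count 'Heads'/'Tails' answers and return the empirical heads ratio."""
    counts = Counter(r.strip().lower() for r in responses)
    total = counts["heads"] + counts["tails"]
    return counts["heads"] / total

# Mock responses standing in for 100 independent "Flip a fair coin" prompts;
# the 72:28 split is illustrative, not measured data.
mock = ["Heads"] * 72 + ["Tails"] * 28
ratio = heads_ratio(mock)
print(f"heads ratio: {ratio:.2f} (drift from fair: {abs(ratio - 0.5):.2f})")
```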
I am very proud of our team for releasing EDINET-Bench, and it is fantastic to see a Japanese financial dataset recognized at #ICLR2026 this week. We need more diverse, non-English datasets to evaluate models in the real world.
Paper: openreview.net/forum?id=Dxn...
Make neural network cells inside a “Digital Petri Dish” fight for control and dominance in a web browser tab.
Digital Ecosystems: Interactive Multi-Agent Neural Cellular Automata
pub.sakana.ai/digital-ecos...
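A toy sketch of the competition dynamic (not the actual NCA): two cell types on a grid fight for territory, with a random cell adopting a random neighbor's type each step.

```python
import numpy as np

def step(grid, rng):
    """One invasion event: a random cell copies a random neighbor's type."""
    h, w = grid.shape
    y, x = rng.integers(h), rng.integers(w)
    dy, dx = rng.choice([-1, 0, 1]), rng.choice([-1, 0, 1])
    grid[y, x] = grid[(y + dy) % h, (x + dx) % w]
    return grid

rng = np.random.default_rng(0)
grid = np.zeros((16, 16), dtype=int)
grid[:, 8:] = 1                      # species 0 on the left, species 1 on the right
for _ in range(2000):
    step(grid, rng)
share = grid.mean()                  # fraction of the dish held by species 1
print(f"species 1 holds {share:.0%} of the dish")
```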
We are hiring Software Engineers in Tokyo to help us scale Sakana AI’s R&D efforts. If you are interested in building the data pipelines and full stack infrastructure needed to push the boundaries of automated scientific discovery, we would love to hear from you. 🗼🎌
sakana.ai/careers/#sof...
Trained solely on recorded input and output traces, it successfully learned to render readable text and control a cursor, proving that a neural network can run as its own visual computing environment without a traditional operating system.
Blog: metauto.ai/neuralcomput...
Instead of interacting with a real operating system, these models can take in user actions like keystrokes and mouse clicks alongside previous screen pixels to predict and generate the next video frames.
A “Neural Computer” is built by adapting video generation architectures to train a World Model of an actual computer that can directly simulate a computer interface.
Paper: arxiv.org/abs/2604.06425
Code: github.com/metauto-ai/N...
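A minimal sketch of the interface such a world model exposes, under assumed names (`ToyScreenWorldModel` is hypothetical, not the paper's architecture): it steps on previous frames plus a user action and returns the next frame.

```python
import numpy as np

class ToyScreenWorldModel:
    """Hypothetical stand-in for a learned screen world model."""

    def __init__(self, height=8, width=8):
        self.h, self.w = height, width

    def step(self, prev_frames, action):
        """prev_frames: list of (h, w) arrays; action: dict of user inputs."""
        frame = prev_frames[-1].copy()
        if action.get("click"):          # mark the "cursor" at the click position
            y, x = action["click"]
            frame[y, x] = 1.0
        # A real model would run a learned video-generation network here.
        return frame

model = ToyScreenWorldModel()
frames = [np.zeros((8, 8))]
nxt = model.step(frames, {"click": (3, 4)})
print(nxt[3, 4])  # the toy "cursor" pixel
```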
Cool work led by Mingchen Zhuge et al. from Schmidhuber’s lab!
I truly believe AI will forever change the landscape of how scientific discoveries and scientific progress are made.
I’m incredibly proud of The AI Scientist team for this milestone publication in Nature. We started this project to explore if foundation models could execute the entire research lifecycle. Seeing this work validated at this level is a special moment.
Sakana Chat, Sakana AI's first service for the general public, is now live 🐟
Try Sakana Chat: chat.sakana.ai
Equipped with a powerful web search agent, it retrieves fast, reliable information.
The world's high-performing open models inevitably carry biases from their developers. Through our own post-training methods, we developed techniques that (1) remove these biases, (2) reflect Japanese values, and (3) achieve safe, context-appropriate adaptation.
This release is the first demonstration of that technology. Please try it as an AI option that everyone in Japan can use with confidence!
“When AI Discovers the Next Transformer”
Full Interview on YouTube: youtu.be/EInEmGaMRLc
Robert Lange (Sakana AI) joins Tim Scarfe (ML Street Talk) to discuss Shinka Evolve, a framework that combines LLMs with evolutionary algorithms to do open-ended program search.
Instead of forcing models to hold everything in an active context window, we can use hypernetworks to instantly compile documents and tasks directly into the model's weights. A step towards giving language models durable memory and fast adaptation.
Blog: pub.sakana.ai/doc-to-lora/
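A tiny numerical sketch of the idea, under assumed shapes and names (`compile_document`, `H_a`, `H_b` are illustrative, not the actual method): a hypernetwork maps a document embedding to low-rank factors that are added to a frozen weight matrix in one forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, rank, d_doc = 16, 4, 8
W_frozen = rng.normal(size=(d_model, d_model))         # frozen base weight
H_a = rng.normal(size=(d_doc, d_model * rank)) * 0.01  # hypernetwork head for A
H_b = rng.normal(size=(d_doc, rank * d_model)) * 0.01  # hypernetwork head for B

def compile_document(doc_embedding):
    """Generate low-rank factors A, B from a document embedding."""
    A = (doc_embedding @ H_a).reshape(d_model, rank)
    B = (doc_embedding @ H_b).reshape(rank, d_model)
    return A, B

doc = rng.normal(size=d_doc)       # stand-in for an encoded document
A, B = compile_document(doc)
W_adapted = W_frozen + A @ B       # the document "compiled into the weights"
print(W_adapted.shape)
```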
"How Competition is Stifling AI Breakthroughs"
The TED AI talk by Sakana AI co-founder Llion Jones is now online. He discusses why open-ended research that avoids overly narrow goals produces breakthroughs, the situation the Transformer's success created in the industry, and the next ideas and results for moving beyond it.
www.ted.com/talks/llion_...
Our journey at Sakana AI is just getting started.
We are looking for people to help us pioneer the next generation of AI—building from Japan to the world.
Join us: sakana.ai/careers
I founded Sakana AI after my time at Google, so it is incredibly meaningful to be able to partner with them now. It feels like a special connection to be working together again to advance the AI ecosystem in Japan.
sakana.ai/google#en
Our work on The AI Scientist and ALE-Agent has already shown the power of these models.
Now, we are scaling reliable AI in mission-critical sectors like finance and government to ensure the highest security and data sovereignty.
Full details: sakana.ai/google#en
We are thrilled to announce a strategic partnership with Google!
Google is also making a financial investment in Sakana AI to strengthen this collaboration. We are combining Google’s world-class products like Gemini and Gemma with our agile R&D to accelerate automated scientific discovery.
An Unofficial Guide to Prepare for a Research Position Application
Authors: Stefania Druga, Luke Darlow, and Llion Jones
Disclaimer: This guide was written by a few researchers at Sakana AI who have interviewed many candidates, and does not reflect the views of the entire organization. Each team may have its own preferences and style for interviewing and finding the people they can work closely with. This document, written by Stefania, Luke, and Llion, offers a glimpse into how some parts of our research org conduct interviews.
We just published an unofficial guide on what we look for when interviewing research candidates at Sakana AI.
Written by Stefania Druga, Luke Darlow, and Llion Jones.
The biggest differentiator? Understanding over implementation.
Read it: pub.sakana.ai/Unofficial_G...
RePo moves us toward models that intelligently curate their own working memory rather than passively accepting input order.
Read the full breakdown on our website:
pub.sakana.ai/repo/
Paper: arxiv.org/abs/2512.14391
Introducing RePo: Language Models with Context Re-Positioning
Standard LLMs force a rigid linear structure on context, treating physical proximity as relevance. Cognitive Load Theory suggests this is inefficient—models waste capacity managing noise instead of reasoning.
arxiv.org/abs/2512.14391
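A toy sketch of the intuition (the names and scoring are illustrative assumptions, not RePo's actual mechanism): rather than accepting arrival order, assign each context segment a position by its estimated relevance to the query.

```python
import numpy as np

def reposition(segments, seg_vecs, query_vec):
    """Order segments by cosine similarity between their vectors and the query."""
    sims = seg_vecs @ query_vec / (
        np.linalg.norm(seg_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    order = np.argsort(-sims)            # most relevant segment first
    return [segments[i] for i in order]

segments = ["boilerplate", "key fact", "unrelated noise"]
seg_vecs = np.eye(3, 4)                  # toy embeddings, one axis per segment
query = np.array([0.1, 1.0, 0.05, 0.0]) # the query mostly matches "key fact"
print(reposition(segments, seg_vecs, query))
```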
2026 is just getting started 🚀✨
We are hiring. Join our team in Tokyo!
sakana.ai/careers
AI adoption: "Japan's full-time employment system, with its low job insecurity, becomes a strength"
Nikkei Business has published an interview with Sakana AI CEO @hardmaru.bsky.social. He discusses the state and challenges of enterprise AI adoption as it accelerates in 2026, and how Japanese corporate culture could work in favor of AI adoption.
business.nikkei.com/atcl/gen/19/...
[Article highlights] 🧵
Reminded me of my older NeurIPS 2021 paper, where we removed the positional encoding entirely, and by doing so, an agent can process an arbitrarily long list of noisy, sensory inputs, in an arbitrary order.
I even made a fun browser demo to play with the agent back then: attentionneuron.github.io
Introducing DroPE: Extending Context by Dropping Positional Embeddings
We found embeddings like RoPE aid training but bottleneck long-sequence generalization. Our solution’s simple: treat them as a temporary training scaffold, not a permanent necessity.
arxiv.org/abs/2512.12167
pub.sakana.ai/DroPE
One of my favorite findings: Positional embeddings are just training wheels. They help convergence but hurt long-context generalization.
We found that if you simply delete them after pretraining and recalibrate for <1% of the original budget, you unlock massive context windows. Smarter, not harder.
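A minimal sketch of the "training wheels" idea, not the paper's method: the same attention layer with a rotary positional scaffold switched on (as in pretraining) or dropped entirely.

```python
import numpy as np

def rope(x):
    """Minimal rotary embedding on channel pairs; x: (seq, dim), dim even."""
    seq, dim = x.shape
    pos = np.arange(seq)[:, None]
    freq = 1.0 / (10000 ** (np.arange(0, dim, 2) / dim))
    ang = pos * freq
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * np.cos(ang) - x2 * np.sin(ang)
    out[:, 1::2] = x1 * np.sin(ang) + x2 * np.cos(ang)
    return out

def attention(q, k, v, use_rope=True):
    if use_rope:                      # positional scaffold during pretraining
        q, k = rope(q), rope(k)
    scores = q @ k.T / np.sqrt(q.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(5, 4))
with_pe = attention(q, k, v, use_rope=True)
without_pe = attention(q, k, v, use_rope=False)   # embeddings dropped
print(with_pe.shape, without_pe.shape)
```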
We are taking our technology far beyond competitive programming to unlock a new era of AI-driven discovery.
We are hiring. Join our team in Tokyo.
sakana.ai/careers/#sof...
We’re hiring.
sakana.ai/careers/#sof...
When agents compete for limited resources, intelligence reorganizes around survival, not elegance.