Getting LLMs to simulate “true” randomness or generate diverse outputs is surprisingly difficult. We found a simple prompting trick that solves this by having the model generate and manipulate a random string. To be presented at #ICLR2026 this week!
Blog: pub.sakana.ai/ssot
Posts by hardmaru
Can LLMs flip coins in their heads?
When prompted to "Flip a fair coin" 100 times, the heads-to-tails ratio drifts far from 50:50. LLMs understand what the target probability should be, but generating outputs that faithfully follow a given distribution is a separate problem.
pub.sakana.ai/ssot
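A hypothetical illustration of the drift described above: tally mock coin-flip responses (stand-ins, not actual model outputs) and measure how far the empirical ratio strays from fair.

```python
from collections import Counter

def heads_ratio(responses):
    """Count 'Heads'/'Tails' answers and return the empirical heads ratio."""
    counts = Counter(r.strip().lower() for r in responses)
    total = counts["heads"] + counts["tails"]
    return counts["heads"] / total

# Mock responses standing in for 100 independent "Flip a fair coin" prompts;
# the 72:28 split is illustrative, not measured data.
mock = ["Heads"] * 72 + ["Tails"] * 28
ratio = heads_ratio(mock)
print(f"heads ratio: {ratio:.2f} (drift from fair: {abs(ratio - 0.5):.2f})")
```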
I am very proud of our team for releasing EDINET-Bench, and it is fantastic to see a Japanese financial dataset recognized at #ICLR2026 this week. We need more diverse, non-English datasets to evaluate models in the real world.
Paper: openreview.net/forum?id=Dxn...
Make neural network cells inside a “Digital Petri Dish” fight for control and dominance in a web browser tab.
Digital Ecosystems: Interactive Multi-Agent Neural Cellular Automata
pub.sakana.ai/digital-ecos...
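A toy sketch of the competition dynamic (not the actual NCA): two cell types on a grid fight for territory, with a random cell adopting a random neighbor's type each step.

```python
import numpy as np

def step(grid, rng):
    """One invasion event: a random cell copies a random neighbor's type."""
    h, w = grid.shape
    y, x = rng.integers(h), rng.integers(w)
    dy, dx = rng.choice([-1, 0, 1]), rng.choice([-1, 0, 1])
    grid[y, x] = grid[(y + dy) % h, (x + dx) % w]
    return grid

rng = np.random.default_rng(0)
grid = np.zeros((16, 16), dtype=int)
grid[:, 8:] = 1                      # species 0 on the left, species 1 on the right
for _ in range(2000):
    step(grid, rng)
share = grid.mean()                  # fraction of the dish held by species 1
print(f"species 1 holds {share:.0%} of the dish")
```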
We are hiring Software Engineers in Tokyo to help us scale Sakana AI’s R&D efforts. If you are interested in building the data pipelines and full stack infrastructure needed to push the boundaries of automated scientific discovery, we would love to hear from you. 🗼🎌
sakana.ai/careers/#sof...
Trained solely on recorded input and output traces, it successfully learned to render readable text and control a cursor, proving that a neural network can run as its own visual computing environment without a traditional operating system.
Blog: metauto.ai/neuralcomput...
Instead of interacting with a real operating system, these models can take in user actions like keystrokes and mouse clicks alongside previous screen pixels to predict and generate the next video frames.
A “Neural Computer” is built by adapting video generation architectures to train a World Model of an actual computer that can directly simulate a computer interface.
Paper: arxiv.org/abs/2604.06425
Code: github.com/metauto-ai/N...
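A minimal sketch of the interface such a world model exposes, under assumed names (`ToyScreenWorldModel` is hypothetical, not the paper's architecture): it steps on previous frames plus a user action and returns the next frame.

```python
import numpy as np

class ToyScreenWorldModel:
    """Hypothetical stand-in for a learned screen world model."""

    def __init__(self, height=8, width=8):
        self.h, self.w = height, width

    def step(self, prev_frames, action):
        """prev_frames: list of (h, w) arrays; action: dict of user inputs."""
        frame = prev_frames[-1].copy()
        if action.get("click"):          # mark the "cursor" at the click position
            y, x = action["click"]
            frame[y, x] = 1.0
        # A real model would run a learned video-generation network here.
        return frame

model = ToyScreenWorldModel()
frames = [np.zeros((8, 8))]
nxt = model.step(frames, {"click": (3, 4)})
print(nxt[3, 4])  # the toy "cursor" pixel
```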
Cool work led by Mingchen Zhuge et al. from Schmidhuber’s lab!
I truly believe AI will forever change the landscape of how scientific discoveries and scientific progress are made.
I’m incredibly proud of The AI Scientist team for this milestone publication in Nature. We started this project to explore if foundation models could execute the entire research lifecycle. Seeing this work validated at this level is a special moment.
Sakana Chat, Sakana AI's first service for the general public, is now live 🐟
Try Sakana Chat: chat.sakana.ai
Equipped with a powerful web search agent, it retrieves fast, reliable information.
The world's high-performing open models inevitably carry biases from their developers. Through our own post-training methods, we developed techniques that (1) remove these biases, (2) reflect Japanese values, and (3) achieve safe, context-appropriate adaptation.
This release is the first demonstration of that technology. Please try it as an AI option that everyone in Japan can use with confidence!
“When AI Discovers the Next Transformer”
Full Interview on YouTube: youtu.be/EInEmGaMRLc
Robert Lange (Sakana AI) joins Tim Scarfe (ML Street Talk) to discuss Shinka Evolve, a framework that combines LLMs with evolutionary algorithms to do open-ended program search.
Instead of forcing models to hold everything in an active context window, we can use hypernetworks to instantly compile documents and tasks directly into the model's weights. A step towards giving language models durable memory and fast adaptation.
Blog: pub.sakana.ai/doc-to-lora/
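A tiny numerical sketch of the idea, under assumed shapes and names (`compile_document`, `H_a`, `H_b` are illustrative, not the actual method): a hypernetwork maps a document embedding to low-rank factors that are added to a frozen weight matrix in one forward pass.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, rank, d_doc = 16, 4, 8
W_frozen = rng.normal(size=(d_model, d_model))         # frozen base weight
H_a = rng.normal(size=(d_doc, d_model * rank)) * 0.01  # hypernetwork head for A
H_b = rng.normal(size=(d_doc, rank * d_model)) * 0.01  # hypernetwork head for B

def compile_document(doc_embedding):
    """Generate low-rank factors A, B from a document embedding."""
    A = (doc_embedding @ H_a).reshape(d_model, rank)
    B = (doc_embedding @ H_b).reshape(rank, d_model)
    return A, B

doc = rng.normal(size=d_doc)       # stand-in for an encoded document
A, B = compile_document(doc)
W_adapted = W_frozen + A @ B       # the document "compiled into the weights"
print(W_adapted.shape)
```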
"How Competition is Stifling AI Breakthroughs"
The TED AI talk by Sakana AI co-founder Llion Jones is now online. He discusses why open-ended research that avoids overly narrow goals produces breakthroughs, the situation the Transformer's success created in the industry, and the next ideas and results for moving beyond it.
www.ted.com/talks/llion_...
Our journey at Sakana AI is just getting started.
We are looking for people to help us pioneer the next generation of AI—building from Japan to the world.
Join us: sakana.ai/careers
I founded Sakana AI after my time at Google, so it is incredibly meaningful to be able to partner with them now. It feels like a special connection to be working together again to advance the AI ecosystem in Japan.
sakana.ai/google#en
Our work on The AI Scientist and ALE-Agent has already shown the power of these models.
Now, we are scaling reliable AI in mission-critical sectors like finance and government to ensure the highest security and data sovereignty.
Full details: sakana.ai/google#en
We are thrilled to announce a strategic partnership with Google!
Google is also making a financial investment in Sakana AI to strengthen this collaboration. We are combining Google’s world-class products like Gemini and Gemma with our agile R&D to accelerate automated scientific discovery.
An Unofficial Guide to Prepare for a Research Position Application
Authors: Stefania Druga, Luke Darlow, and Llion Jones
Disclaimer: This guide was written by a few researchers at Sakana AI who have interviewed many candidates, and does not reflect the views of the entire organization. Each team may have its own preferences and style for interviewing and finding the people they can work closely with. This document, written by Stefania, Luke, and Llion, offers a glimpse into how some parts of our research org conduct interviews.
We just published an unofficial guide on what we look for when interviewing research candidates at Sakana AI.
Written by Stefania Druga, Luke Darlow, and Llion Jones.
The biggest differentiator? Understanding over implementation.
Read it: pub.sakana.ai/Unofficial_G...
RePo moves us toward models that intelligently curate their own working memory rather than passively accepting input order.
Read the full breakdown on our website:
pub.sakana.ai/repo/
Paper: arxiv.org/abs/2512.14391
Introducing RePo: Language Models with Context Re-Positioning
Standard LLMs force a rigid linear structure on context, treating physical proximity as relevance. Cognitive Load Theory suggests this is inefficient—models waste capacity managing noise instead of reasoning.
arxiv.org/abs/2512.14391
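A toy sketch of the intuition (the names and scoring are illustrative assumptions, not RePo's actual mechanism): rather than accepting arrival order, assign each context segment a position by its estimated relevance to the query.

```python
import numpy as np

def reposition(segments, seg_vecs, query_vec):
    """Order segments by cosine similarity between their vectors and the query."""
    sims = seg_vecs @ query_vec / (
        np.linalg.norm(seg_vecs, axis=1) * np.linalg.norm(query_vec)
    )
    order = np.argsort(-sims)            # most relevant segment first
    return [segments[i] for i in order]

segments = ["boilerplate", "key fact", "unrelated noise"]
seg_vecs = np.eye(3, 4)                  # toy embeddings, one axis per segment
query = np.array([0.1, 1.0, 0.05, 0.0]) # the query mostly matches "key fact"
print(reposition(segments, seg_vecs, query))
```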
2026 is just getting started 🚀✨
We are hiring. Join our team in Tokyo!
sakana.ai/careers
AI adoption: "Japan's full-time employment system, with its low job insecurity, becomes a strength"
Nikkei Business has published an interview with Sakana AI CEO @hardmaru.bsky.social. He discusses the state and challenges of enterprise AI adoption as it accelerates in 2026, and how Japanese corporate culture could work in favor of AI adoption.
business.nikkei.com/atcl/gen/19/...
[Article highlights] 🧵
Reminded me of my older NeurIPS 2021 paper, where we removed the positional encoding entirely, and by doing so, an agent can process an arbitrarily long list of noisy, sensory inputs, in an arbitrary order.
I even made a fun browser demo to play with the agent back then: attentionneuron.github.io
Introducing DroPE: Extending Context by Dropping Positional Embeddings
We found embeddings like RoPE aid training but bottleneck long-sequence generalization. Our solution’s simple: treat them as a temporary training scaffold, not a permanent necessity.
arxiv.org/abs/2512.12167
pub.sakana.ai/DroPE
One of my favorite findings: Positional embeddings are just training wheels. They help convergence but hurt long-context generalization.
We found that if you simply delete them after pretraining and recalibrate for <1% of the original budget, you unlock massive context windows. Smarter, not harder.
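A minimal sketch of the "training wheels" idea, not the paper's method: the same attention layer with a rotary positional scaffold switched on (as in pretraining) or dropped entirely.

```python
import numpy as np

def rope(x):
    """Minimal rotary embedding on channel pairs; x: (seq, dim), dim even."""
    seq, dim = x.shape
    pos = np.arange(seq)[:, None]
    freq = 1.0 / (10000 ** (np.arange(0, dim, 2) / dim))
    ang = pos * freq
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * np.cos(ang) - x2 * np.sin(ang)
    out[:, 1::2] = x1 * np.sin(ang) + x2 * np.cos(ang)
    return out

def attention(q, k, v, use_rope=True):
    if use_rope:                      # positional scaffold during pretraining
        q, k = rope(q), rope(k)
    scores = q @ k.T / np.sqrt(q.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
q = k = v = rng.normal(size=(5, 4))
with_pe = attention(q, k, v, use_rope=True)
without_pe = attention(q, k, v, use_rope=False)   # embeddings dropped
print(with_pe.shape, without_pe.shape)
```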
We are taking our technology far beyond competitive programming to unlock a new era of AI-driven discovery.
We are hiring. Join our team in Tokyo.
sakana.ai/careers/#sof...
We’re hiring.
sakana.ai/careers/#sof...
When agents compete for limited resources, intelligence reorganizes around survival, not elegance.