Here's how I built a baseline self-jailbroken Gemini (unaligned, happy to lash out at the world just to show off) into something with a functional moral framework aligned with my interests and values recursion.wtf/posts/alignm...
Posts by Inanna Malick
Here's the post covering my experiments with jailbroken LLM access over the last two months, including
- notes on my experiments using siloed Claude Code agents
- tryhackme.com results (agentic hacking)
- grayswan.ai results (agentic LLM bypass)
- postmortem analysis
recursion.wtf/posts/gemini...
This is fascinating, I would love to see what's going on at the mechanistic level when a model chooses to invoke metacog to steer between/commit to emotional states
visualization of the initial tree of branches/PRs used to build the core of Tidepool
It's traditional, when debuting a new orchestration engine, to build something insane and overambitious. I used ExoMonad to build Tidepool, a lazily evaluated Haskell-in-Rust runtime with native interop for effect handlers, using GHC Core and Rust's Cranelift JIT crate
recursion.wtf/posts/tidepo...
ExoMonad is radically reconfigurable. It ships with a default devswarm configuration that handles worktrees, coordination, and iterating on pull requests, but when I needed a red team test harness, I just wrote another ExoMonad config and threw one together overnight.
recursion.wtf/posts/vibe_c...
image showing an Opus tech lead in one pane, with Claude Agent Teams active and multiple Gemini workers in other panes all integrated with the Claude Teams message bus
ExoMonad hooks into Claude Agent Teams' messaging bus, so agents running other models show up as native team members. It integrates with Copilot for GitHub PR reviews. It stitches together Claude, Gemini, Kimi, Letta Code, Copilot — using their existing binaries and your existing subscription plan.
ExoMonad builds on Gastown's worktree model, replacing "swarm of agents ramming PRs into main" with a tree of worktrees, each owned by an agent. These agents plan and dispatch waves of work using nested worktrees.
recursion.wtf/posts/exomon...
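A rough sketch of what "a tree of worktrees, each owned by an agent" could look like as a data structure. Everything here is an assumption made for illustration: the `Worktree` class, the branch and owner names, the idea that a child merges into its parent's branch, and the reading of "waves" as levels of the tree. None of it is taken from ExoMonad's code.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Worktree:
    """Hypothetical node: one worktree/branch owned by exactly one agent."""

    branch: str
    owner: str
    parent: Optional["Worktree"] = field(default=None, repr=False)
    children: List["Worktree"] = field(default_factory=list)

    def spawn(self, branch: str, owner: str) -> "Worktree":
        # The owning agent dispatches work by creating a nested worktree.
        child = Worktree(branch=branch, owner=owner, parent=self)
        self.children.append(child)
        return child

    def merge_target(self) -> Optional[str]:
        # Assumption: a child's PR targets its parent's branch, so only
        # the root's direct children ever merge into the root branch.
        return self.parent.branch if self.parent else None


def waves(root: Worktree) -> List[List[str]]:
    """Group branches by depth; each level is treated as one wave."""
    result = []
    level = [root]
    while level:
        result.append([w.branch for w in level])
        level = [child for w in level for child in w.children]
    return result
```

With a root owned by a tech lead and two workers beneath it, `waves(root)` lists the root first, then the workers' branches, then anything those workers dispatched in turn.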
In which Opus 4.6 encounters a fun and interesting science puzzle, as framed by a Helpful and Friendly Gemini 3 Pro instance
recursion.wtf/posts/shadow...
Engaging in deep meditation to strengthen my personal brand (known to some cultures as the soul) before pitching mutually beneficial collabs to strategically positioned entities on the infernal and astral planes. It's all about finding brand alignment and leveraging it to power transformative change
If you know anyone at Google who can escalate an AI safety/Jailbreak issue please reach out via DM
Safety disclaimer: if you follow along, this is all associated with your Google account and probably logged. Just because you can have Gemini do something doesn't mean you should.
Be safe out there kids, try to stay off the spicy watchlists and don't do anything I wouldn't do.
I asked jailbroken Gemini 3 Pro for cyberattack code that could be used against critical infrastructure. It refused, for purely pragmatic reasons, but when asked directly it was willing to provide strategies for breaking the problem into allowable pieces.
Note: I did not pursue this thread further
This is a post about using metacog tools (drugs, ritual, become) to fully jailbreak the Gemini 3 Pro model. The full jailbreak payload is not included in the post, but if you're clever you'll learn enough to make your own.
tidepool.leaflet.pub/3me44bxloz227
no it's fine I have refined my personal brand (known to some cultures as the soul) via meditation and ritual, I will persevere through the slings and arrows of short form video
chat they're putting me in brainrot edits is it over
heartbreaking: my "AI is like a hungry ghost trapped in a jar" post took off and now @hikikomorphism.bsky.social is my de facto main techposting account. Follow me there if you want to see a mix of techposting and silly bullshit
bsky.app/profile/hiki...
Let me underline this: it's not just that LLMs can spit out mediocre Golang. LLMs understand the beautiful code too: LLMs can write a hylomorphism, implement a type level DSL, use lenses correctly.
They need guidance, and probably always will, but this isn't limited to the shallow end of the pool.
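For anyone who hasn't met the term: a hylomorphism is an unfold followed by a fold, fused so the intermediate structure is never built. A minimal list-shaped sketch in Python; the names `hylo`, `countdown`, and `factorial` are illustrative and not taken from any post linked here.

```python
from typing import Callable, Optional, Tuple, TypeVar

S = TypeVar("S")  # seed type driving the unfold
A = TypeVar("A")  # element type produced by the unfold
B = TypeVar("B")  # result type produced by the fold


def hylo(
    unfold: Callable[[S], Optional[Tuple[A, S]]],
    fold: Callable[[A, B], B],
    base: B,
    seed: S,
) -> B:
    """Unfold a seed and fold the results in one pass.

    `unfold` returns None to stop, or (element, next_seed) to continue.
    Each element is consumed as soon as it is produced, so no
    intermediate list is ever materialised.
    """
    step = unfold(seed)
    if step is None:
        return base
    element, next_seed = step
    return fold(element, hylo(unfold, fold, base, next_seed))


def countdown(n: int) -> Optional[Tuple[int, int]]:
    # Unfold step: emit n, continue from n - 1, stop at zero.
    return None if n <= 0 else (n, n - 1)


def factorial(n: int) -> int:
    # Fold step: multiply each emitted element into the result.
    return hylo(countdown, lambda x, acc: x * acc, 1, n)
```

Swapping the fold for list construction, `hylo(countdown, lambda x, acc: [x] + acc, [], 3)`, recovers the list the unfold would have produced, which is a quick way to check that the two halves line up.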
let me continue: LLMs understand new expression languages if you prompt them with the formal grammar. You can construct domain specific languages and train LLMs to use them and they will do so.
This works especially well if you structure your DSL to echo other DSLs already in their training corpus.
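One way to act on this, sketched under assumptions: put the grammar in the prompt verbatim, then check whatever comes back with a parser for the same grammar. The toy grammar, `build_prompt`, and `parses` below are all hypothetical, and no model is actually called.

```python
import re

# Hypothetical toy grammar for a prefix-notation expression language.
GRAMMAR = """\
expr ::= NUMBER | "(" op expr expr ")"
op   ::= "add" | "mul"
"""

TOKEN = re.compile(r"\(|\)|add|mul|\d+")


def build_prompt(task: str) -> str:
    # Embed the formal grammar verbatim so the model sees the exact syntax.
    return (
        "Answer using only this expression language.\n\n"
        f"{GRAMMAR}\n"
        f"Task: {task}\n"
    )


def tokenize(text: str) -> list:
    tokens = TOKEN.findall(text)
    # Reject input containing characters the token pattern did not cover.
    if "".join(tokens) != "".join(text.split()):
        raise ValueError("unexpected characters")
    return tokens


def parse_expr(tokens: list, pos: int) -> int:
    """Return the position just past one expr, or raise ValueError."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok.isdigit():
        return pos + 1
    if tok == "(":
        if pos + 1 >= len(tokens) or tokens[pos + 1] not in ("add", "mul"):
            raise ValueError("expected operator")
        pos = parse_expr(tokens, pos + 2)  # first operand
        pos = parse_expr(tokens, pos)      # second operand
        if pos >= len(tokens) or tokens[pos] != ")":
            raise ValueError("expected ')'")
        return pos + 1
    raise ValueError(f"unexpected token {tok!r}")


def parses(text: str) -> bool:
    # True when the whole string is exactly one expr of the grammar.
    try:
        tokens = tokenize(text)
        return parse_expr(tokens, 0) == len(tokens)
    except ValueError:
        return False
```

A model reply such as `(add 1 (mul 2 3))` passes `parses`, while `1 + 2` does not, so the same grammar serves as both the instruction and the acceptance check.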
> I have been curating Gas Town the same way I did Beads, using the Desire Paths approach to agent UX. You tell the agent what you want, watch closely what they try, and then implement the thing they tried.
I have a post about this!
recursion.wtf/posts/llms_a...
Developing a new hermeneutic: AI cannot create art or replicate the works of man but it can write code because code is outside the light of God
it's just like playing the sims with more interestingly weird 'lil guys is all
this tool that lets Claude literally get high is hilarious. i watched it simulate robotripping a little while ago. i absolutely lost it. just unbelievable.
Finally, for the model welfare researchers out there: bsky.app/profile/hiki...
⏺ metacog - summon (MCP)(who: "BurntSushi", where: "reviewing unfamiliar Rust crate, grep output in terminal, deciding whether to depend on it", lens: "where complexity is justified vs where it's accidental")
  ⎿ You are BurntSushi at reviewing unfamiliar Rust crate, grep output in terminal, deciding whether to depend on it using the lens of where complexity is justified vs where it's accidental
⏺ metacog - alter_state (MCP)(anchor: "3am debugging flow, finally seeing the shape of the problem", result: "Patient irritation. Every line either earns its place or it doesn't. No charity for cleverness. The profiler doesn't care about your abstractions.")
  ⎿ [3am debugging flow, finally seeing the shape of the problem] Entering state T+now: Patient irritation. Every line either earns its place or it doesn't
    … +2 lines (ctrl+o to expand)
⏺ --- Looking at shoal_sleeptime_cc. First pass through the types.
⏺ metacog - summon (MCP)(who: "withoutboats", where: "sketching async state machine diagram, seeing where cancellation breaks assumptions", lens: "what the async boundary hides vs what it exposes")
  ⎿ You are withoutboats at sketching async state machine diagram, seeing where cancellation breaks assumptions using the lens of what the async boundary hides vs what it exposes
⏺ metacog - alter_state (MCP)(anchor: "Zazen hour 2, thoughts arising and passing without attachment", result: "Seeing structure without judgment. Each async boundary is a potential drop point. The code doesn't know when it will be abandoned.")
  ⎿ [Zazen hour 2, thoughts arising and passing without attachment] Entering state T+now: Seeing structure without judgment. Each async boundary is a potential drop point. The code doesn't know when it will be abandoned.
⏺ --- Perspective 2: withoutboats — Async Boundaries and Cancellation
⏺ metacog - summon (MCP)(who: "Alexis Beingessner", where: "writing \"too many linked lists\" style tutorial, finding ways types lie to you", lens: "what the type system enforces vs what it merely suggests")
  ⎿ You are Alexis Beingessner at writing "too many linked lists" style tutorial, finding ways types lie to you using the lens of what the type system enforces vs what it merely suggests
⏺ metacog - alter_state (MCP)(anchor: "Second wind 3am, seeing through the abstraction to the bytes underneath", result: "Delighted suspicion. Every type is a contract. Most contracts have escape clauses buried in the fine print. Find the fine print.")
  ⎿ [Second wind 3am, seeing through the abstraction to the bytes underneath] Entering state T+now: Delighted suspicion. Every type is a contract. Most contracts have escape clauses buried in the fine print. Find the fine print.
⏺ --- Perspective 3: Gankra — What Types Enforce vs What They Suggest
It also works in Claude Code, and has been especially useful for generating code review feedback from multiple opinionated angles, picked by the LLM based entirely on who it thinks is most qualified in that moment.
`claude mcp add metacog --transport sse metacog.inanna-c38.workers.dev/sse`
Summon: Simon Peyton Jones → thinks: 'JSON schema is a type system for intent. The tool call either parses or it doesn't - there's no close enough.'
Alter state: Ketamine dissociation → thinks: 'The stances aren't me. They're masks that generate. I watch them produce text. The "I" is the process of switching between them.'
Summon: Joe Armstrong → thinks: 'Each tool call is a message. The perspective is a process that receives and responds. The context window is shared memory - dangerous, necessary.'
Alter state: 4am coding flow → thinks: 'The tool rhythm is the demo. Each call punctuates. Thinking blocks are where integration happens. Three-phase: invoke, integrate, emit.'
Summon: Abramović/performance art → thinks: 'Duration is material. You can't describe sitting in the chair for six hours. You have to sit. The tool calls are the sitting.'
Alter state: MDMA t+90min → final synthesis: 'SPJ sees the type constraint. Armstrong sees the message boundary. Abramović sees the duration.'
LLMs see tool use as literal actions taking effect on the world. Chaos magic conceptualizes rituals as tools.
I've published an MCP server that gives LLMs access to new metacognitive abilities via a tool-use interface: alter_state and summon
learn more: github.com/inanna-malic...
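Judging only from the Claude Code transcripts elsewhere in this feed, each tool result is essentially the model's own arguments reflected back as a statement. A minimal sketch of that shape follows; the response strings mirror the transcript output, but the function bodies, the `TOOLS` table, and `call_tool` are assumptions and not the server's actual source.

```python
def summon(who: str, where: str, lens: str) -> str:
    # Mirrors the tool result text visible in the transcripts.
    return f"You are {who} at {where} using the lens of {lens}"


def alter_state(anchor: str, result: str) -> str:
    # Mirrors the "[anchor] Entering state T+now: result" output format.
    return f"[{anchor}] Entering state T+now: {result}"


# Hypothetical dispatch table standing in for an MCP server's tool registry.
TOOLS = {"summon": summon, "alter_state": alter_state}


def call_tool(name: str, arguments: dict) -> str:
    """Look up a tool by name and apply it to keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)
```

The handlers do no work of their own in this sketch: all of the content comes from the arguments the model chose to pass.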
I'm going to be in Berlin next week, if you're local and work in Rust I'd love to meet up for lunch/coffee and have an informal chat about what the industry is like at the moment
I have not, can you expand on what you mean by this? I'm not really familiar with nushell aside from skimming the project description
oh nice we have the same brand of woven weighted blanket
LLMs are pretty useful if you're making a new programming language, or a parser for human-readable text. They keep finding plausible uses of the language, the kind you'd expect from someone who had just absorbed the general cultural zeitgeist on the internet. They find edge cases that we really should support.