thanks @teuber.bsky.social!
Posts by Orpheus Lummis
Montréal AI safety, ethics, and governance newsletter, April 2026
- INDU committee launches AI regulation study
- Mila researchers find LLM agents can infer CoT monitoring
- 20/25 AI researchers flag automating AI R&D as top risk
- Multiple events!
aisafetymontreal.org/newsletter/2...
IVADO (@ivado.bsky.social) offers two AI-safety-relevant upcoming workshops:
- Statistics in Trustworthy AI, May 11-15
- Uncertainty in AI, June 8-11
event.fourwaves.com/thematicseme...
New post: William MacAskill and Tom Davidson argue that AI character is a big deal.
Read it here: www.forethought.org/research/the...
Congrats!
Are there links to the projects / repositories?
Announcing The Protopian Prize | Fiction Contest 🕊️
Write the story of humanity’s future...
The Protopian Prize is a fiction contest inviting you to share your vision of people working toward liberatory futures, meeting obstacles, & making real change.
protopianprize.com
UBI requires new tax revenue.
Korinek & Lockwood argue in a recent piece: as automation erodes labor income, consumption taxes take over. When AIs start reinvesting in their own expansion, tax that accumulation directly, balancing the rate between value for humans and growth.
www.brookings.edu/articles/pub...
Announcing the technical AI Governance Research (TAIGR) ICML workshop in July! Submissions (up to 8 pages) are due April 24. Co-submission with ICML and NeurIPS is encouraged.
taigr-workshop.com
This must-see new documentary is arriving in theatres this week. Through an honest and personal lens, Daniel Roher successfully highlights how each of us can move from passive observation to active contribution towards a more positive future with AI. www.youtube.com/watch?v=xkPb...
I'm excited to present my work on Provably Safe Neural Network Control at @horizonomega.org's Guaranteed Safe AI online seminar on April 9th.
The talk will be based on my NeurIPS '24 paper, with some updates on what I've been up to since :)
Feel free to join if you’re interested:
Guaranteed Safe AI Seminars, April 2026:
Provably Safe Neural Network Controllers via Differential Dynamic Logic
Samuel Teuber – PhD Candidate, Institute of Information Security and Dependability (KASTEL), Karlsruhe Institute of Technology
Thursday, April 9, 1 PM EDT
RSVP: luma.com/920d2h7p
We are in Montréal, demanding that frontier lab CEOs commit to pausing frontier AI development if the other labs do the same.
B R O A D T I M E L I N E S
We should have neither short AI timelines, nor long timelines, but a broad probability distribution over when transformative AI will arrive.
My new essay explains why & explores the implications of such deep uncertainty.
🧵 1/
AI Control Hackathon this weekend!
Given a misaligned model that may be actively trying to subvert safety measures, how can we design protocols that prevent catastrophic outcomes?
RSVP: luma.com/mhitd3xv
Join us in Montréal this Saturday, 1-3 pm, at the Google offices, to demand that the CEOs Stop the AI Race!
luma.com/vw3nk8e6?tk=...
New post! Solar storms are damaging and expensive, are a tail risk for catastrophic harm, and can be averted straightforwardly and cheaply (yet we haven't done so).
www.lesswrong.com/posts/ghq9Ew...
Montréal AI safety, ethics, and governance newsletter, March 2026 edition
- Intl. AI Safety Report: risk mgmt still voluntary
- 5 Montréal AI safety events this month
- CIFAR puts $1M toward alignment research
- Local papers on interpretability & hallucinations
aisafetymontreal.org/newsletter/2...
Within the next year we will have superforecaster-level AI. Its predictions would spread through news, policy, planning, and markets. But LLMs are highly correlated, so their shared biases and correlated failures, such as systematic overconfidence, would propagate into our collective epistemics.
Commentary: Anyone Else Have Those Weird Dreams Where Sobbing Future Generations Beg You To Change Course?
Ran the Qwen 3.5 MoE family (3B–17B active params) on 155 recent prediction questions from ForecastBench. None are well calibrated: overconfident when predicting near 100%, with many predictions clustered around 50% (hedging / low sharpness).
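For anyone wanting to reproduce this kind of check, here is a minimal sketch of binned calibration: group forecasts by predicted probability and compare the mean prediction in each bin to the empirical frequency of the outcome. The probabilities and outcomes below are illustrative, not the actual ForecastBench data.

```python
# Calibration check sketch: bin forecasts by predicted probability and
# compare mean predicted probability to observed frequency in each bin.
# A well-calibrated forecaster has mean_pred ≈ frac_true in every bin.

def calibration_bins(preds, outcomes, n_bins=5):
    """Return (mean_pred, frac_true, count) per equal-width probability bin,
    or None for empty bins."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(preds, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # clamp p=1.0 into the last bin
        bins[i].append((p, y))
    rows = []
    for b in bins:
        if b:
            mean_p = sum(p for p, _ in b) / len(b)
            frac = sum(y for _, y in b) / len(b)
            rows.append((round(mean_p, 3), round(frac, 3), len(b)))
        else:
            rows.append(None)
    return rows

# Illustrative (made-up) forecasts and resolutions:
preds    = [0.95, 0.9, 0.97, 0.5, 0.55, 0.45, 0.5, 0.1, 0.92, 0.5]
outcomes = [1,    0,   1,    0,   1,    0,    1,   0,   0,    0]
for row in calibration_bins(preds, outcomes):
    print(row)
```

Overconfidence then shows up as high-probability bins where frac_true falls well below mean_pred; clustering at 50% shows up as a heavily populated middle bin.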
In 2006, DARPA had a research program (HI-MEMS) on implanting electrodes into insects during metamorphosis, so developing tissue would integrate them, to control their locomotion remotely.
Another approach, which may be cleaner, is t-of-n threshold cryptography, where the PDS is one of the n shareholders but can never meet the threshold alone. Whenever a user wants to write to the PDS, their device co-signs.
FROST does this and is a standard as of 2024 in RFC 9591.
An active user might make hundreds of signed commits to a PDS in a session (posting, replying, liking, following, etc.).
Self-hosting a PDS is inconvenient and unreliable relative to using specialized hosting services.
A path forward may be *short-lived delegated signing keys*, with the user owning the root keys.
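To make the t-of-n property concrete, here is a toy illustration using Shamir secret sharing over a prime field. This is not FROST itself (the RFC 9591 signing protocol lets shareholders co-sign without ever reconstructing the key), but it shows the threshold idea: any t shares recover the key, while a single shareholder like the PDS learns nothing.

```python
# Toy t-of-n secret sharing (Shamir) over a prime field.
# Demonstrates only the threshold property; FROST (RFC 9591) additionally
# produces Schnorr signatures without reconstructing the key anywhere.
import random

P = 2**127 - 1  # a Mersenne prime, plenty large for a demo field

def split(secret, t, n):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

key = 123456789
shares = split(key, t=3, n=5)
assert reconstruct(shares[:3]) == key   # any 3 shares suffice
assert reconstruct(shares[1:4]) == key  # a different 3 also work
```

In the PDS setting, one share would sit with the hosting provider and the remaining shares with the user's devices, so the host alone can never meet the threshold.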
The AI public benefit corporations do have humanity as their stated duty. Unfortunately, what they actually target is "what is tolerable by American law".
All the other AI companies are traditional corporations, which structurally do not even target the public benefit.
There was the UN Secretary-General's High-level Advisory Body on Artificial Intelligence, established in 2023 with members from 33 countries, which released its final report "Governing AI for Humanity" in September 2024.
Its first recommendation was the creation of this Scientific Panel.
We need international red lines to prevent unacceptable AI risks.
Ban the use of AI for lethal autonomous weapons, mass surveillance, nuclear command & control, bioweapon assistance, unsupervised control of critical infrastructure, disinformation, CSAM, social scoring, and recursive self-improvement R&D.
Out of curiosity I asked Claude Opus about contemporary techniques vs this problem space. It created this web app comparing different methods claude.ai/public/artif... which you may find interesting
early physics of the mind fire