There are similarly loads of ways that the system dynamics resulting from billions of interacting, individually pro-social, AIs can go wrong and weird.
Posts by Jascha Sohl-Dickstein
It helps if participants in government (/corporations/economies/...) act in good faith. But there are loads of ways that well-intentioned smart people can achieve terrible group outcomes.
[Alignment of systems built out of AIs] is to [AI alignment], what [good governance] is to [raising an ethical human].
Gradient estimators, not gradient resonators
version of the meta-loss. The current best version of the algorithm: openreview.net/forum?id=Vhb...
We would avoid it if we could! The challenge is that when you descend the meta-loss by gradient descent, you converge into the perilous region. It's hard to know how to exclude it. Our best approaches so far use stochastic finite difference gradient estimators (variants of ES) to descend a smoothed
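The ES-style approach mentioned above can be sketched roughly as follows. This is an illustrative antithetic-sampling estimator of the gradient of a Gaussian-smoothed loss, not the exact algorithm from the linked paper; the toy quadratic `loss`, the step size, and all other hyperparameters here are invented for demonstration:

```python
import numpy as np

def es_gradient(loss, theta, sigma=0.1, n_samples=64, rng=None):
    """Stochastic finite-difference (evolution strategies) estimate of the
    gradient of the Gaussian-smoothed loss E_eps[loss(theta + sigma*eps)],
    with eps ~ N(0, I). Uses antithetic pairs to reduce variance."""
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.standard_normal(theta.shape)
        # Central finite difference along a random direction.
        delta = loss(theta + sigma * eps) - loss(theta - sigma * eps)
        grad += delta / (2.0 * sigma) * eps
    return grad / n_samples

# Descend a toy "meta-loss" using the smoothed-gradient estimate.
loss = lambda th: np.sum(th ** 2)
theta = np.array([1.0, -2.0])
for _ in range(200):
    theta -= 0.1 * es_gradient(loss, theta)
```

Because the estimator differentiates a *smoothed* version of the objective, it can step over the kind of jagged, fractal loss structure described below, where exact gradients are unreliable.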
The boundary between trainable and untrainable configurations is *fractal*! (and beautiful!)
For details, and more pretty videos, see:
blog post: sohl-dickstein.github.io/2024/02/12/f...
paper: arxiv.org/abs/2402.06184
Examples of fractals resulting from neural network training in a variety of experimental configurations
Have you ever done a dense grid search over neural network hyperparameters? Like a *really dense* grid search? It looks like this (!!). Bluish colors correspond to hyperparameters for which training converges, reddish to those for which training diverges.
Even better, a video: vimeo.com/903855670
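A minimal sketch of what computing such a convergence/divergence grid looks like, using a two-parameter toy problem in place of an actual neural network; the toy loss, learning-rate range, and divergence threshold are all invented for illustration:

```python
import numpy as np

def diverges(lr1, lr2, steps=100):
    """Gradient descent on the toy loss (w1*w2 - 1)**2, with a separate
    learning rate per weight; report whether training blows up."""
    w1, w2 = 1.5, 1.5
    for _ in range(steps):
        e = w1 * w2 - 1.0
        g1, g2 = 2.0 * e * w2, 2.0 * e * w1
        w1, w2 = w1 - lr1 * g1, w2 - lr2 * g2
        if abs(w1) > 1e6 or abs(w2) > 1e6:
            return True  # diverged
    return False

# A coarse grid over the two learning rates: True = diverged, False = converged.
lrs = np.linspace(0.01, 1.0, 25)
grid = np.array([[diverges(a, b) for b in lrs] for a in lrs])
```

Plotting `grid` at much higher resolution, and zooming in on the boundary between the two regions, is what produces the fractal images in the post.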
The “principle of indifference” is often presented as an intuitively obvious motivation for specifying “non-informative” prior models. Unfortunately that intuition quickly falls apart in many common applications. A long thread about applied probability theory!
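One standard illustration of how that intuition fails (this example is mine, not necessarily from the linked thread): a prior that is "indifferent" in one parameterization is informative in another. Here, a uniform prior on a square's side length implies a decidedly non-uniform prior on its area:

```python
import numpy as np

# "Indifference" is not parameterization-invariant: a uniform prior on a
# square's side length s induces a non-uniform prior on its area s**2.
rng = np.random.default_rng(0)
s = rng.uniform(0.0, 1.0, size=100_000)  # uniform over side length in [0, 1]
area = s ** 2                            # induced prior over area

# Under the induced prior, small areas are overrepresented:
# P(area < 0.25) = P(s < 0.5) = 0.5, not 0.25 as "indifference" over
# area would suggest.
p_small = np.mean(area < 0.25)
```

So the answer to "which variable should the prior be flat in?" is itself an informative modeling choice, which is the crux of the thread's argument.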
All the appointments are filled. Will see how the meetings go, and evaluate doing this again. I'm looking forward to finding out what people are interested in!
I created 11 meeting slots for this first round. I'll reply to this message when/if they've all filled up.
calendar.app.google/ZdnWzeYw3qyR...
I'm running an experiment, and holding some public office hours (inspired by Kyunghyun Cho doing something similar).
Talk with me about anything! Ask for advice on your research or startup or career or I suppose personal life, brainstorm new research ideas, complain about mistakes I've made, ...
Paper here: arxiv.org/abs/2311.02462
(Imagen is a Level 3 "Expert" Narrow AI)
These Levels of AGI provide a rough framework to quantify the performance, generality, and autonomy of AGI models and their precursors. We hope they help compare models, assess risk, and measure progress along the path to AGI.
(AlphaGo is a Level 4 "Virtuoso" Narrow AI)
Levels of AGI: Operationalizing Progress on the Path to AGI
Levels of Autonomous Driving are extremely useful for communicating capabilities, setting regulation, and defining goals in self-driving.
We propose analogous Levels of *AGI*.
(ChatGPT is a Level 1 "Emerging" AGI)
AI can enable awesome (as in inspiring of awe) good in the world. We have amazing leverage as the people building it. We should use that leverage carefully.
A theme is that we should worry about a *diversity* of risks. If we recognize e.g. only specific present harms, or only AGI misalignment risk, we will find our efforts overwhelmed by other types of AI-enabled disruption, and we won't be able to fix the problem we care about.
My top fears include targeted manipulation of humans, autonomous weapons, massive job loss, AI-enabled surveillance and subjugation, widespread failure of societal mechanisms, extreme concentration of power, and loss of human control.
AI has the power to change the world in both wonderful and terrible ways. If we exercise care, the wonderful outcomes will be much more likely than the terrible ones. Towards that end, here is a brain dump of my thoughts about how AI might go wrong.
sohl-dickstein.github.io/2023/09/10/d...