
Posts by James MacGlashan

I've sometimes done this, but it never sticks because my finger dexterity is just bad enough to make it uncomfortable :p

6 days ago 1 0 0 0

Now if you really want something to worry about, it's not whether AI can be aligned, but who aligns it!

1 week ago 1 0 1 0

We also have good frameworks for adaptation. My colleagues and I wrote about how POMDPs over hidden rewards endow the right decision-making for refining understanding.

The hard parts are the "normal" AI problems.
cdn.aaai.org/ocs/ws/ws017...
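
To make that concrete, here's a minimal sketch of the idea, not the paper's implementation; the candidate reward functions and the feedback likelihood below are illustrative assumptions:

```python
# Minimal sketch: treat the human's reward function as hidden POMDP state.
# The agent keeps a belief over candidate reward functions and updates it
# from observed feedback. All names and numbers here are illustrative.

import numpy as np

# Hypothetical: three candidate reward functions over four states.
candidate_rewards = np.array([
    [1.0, 0.0, 0.0, 0.0],   # hypothesis 1: human wants state 0
    [0.0, 1.0, 0.0, 0.0],   # hypothesis 2: human wants state 1
    [0.0, 0.0, 0.5, 0.5],   # hypothesis 3: human wants states 2/3
])
belief = np.ones(3) / 3      # uniform prior over hypotheses

def update_belief(belief, state, feedback, temp=1.0):
    """Bayes update; P(feedback | hypothesis) assumed to follow a
    Bradley-Terry-style model of noisy approval of the visited state."""
    r = candidate_rewards[:, state]
    p_up = 1.0 / (1.0 + np.exp(-r / temp))   # chance of a thumbs-up
    likelihood = p_up if feedback == 1 else (1.0 - p_up)
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Acting on the *belief*: pursue reward in expectation over hypotheses,
# which is what makes "refining understanding" part of the policy itself.
def expected_reward(belief):
    return belief @ candidate_rewards

belief = update_belief(belief, state=1, feedback=1)
print(expected_reward(belief))  # mass shifts toward hypothesis 2
```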

1 week ago 1 0 1 0

RE: Corrigibility, yes, adapting from feedback is preferable.

A lot of my old postdoc research involved agents that learned what people wanted from various forms of feedback while pursuing it.

Again, the hard part was building smart agents that can understand and infer, not setting them up to pursue.

1 week ago 1 0 1 0

When you layer in that we have extremely strong incentives to align agents, because otherwise it's an awful product, it would be bizarre if we didn't make ever more aligned agents as we make smarter systems.

1 week ago 1 0 1 0

The same holds outside my work.

Again, doomers used to worry that common-sense understanding was something that could be neglected, to our peril. There were a million obtuse-genie arguments.

Then LLMs came along and showed how wrong those concerns were. Smarter systems, better alignment.

1 week ago 1 0 1 0

Let me put it this way. Doomers seem to think alignment is hard because it's separate from capability.

But my ~20 years of agent research says the exact opposite. Alignment is made easy by smarter systems. And I see no reason that would change in the future.

1 week ago 1 0 1 0

Regarding them holding up, they hold up as well as we can build systems that can _actually_ understand us. If they do, they will hold up quite well! If they don't, then they also have a serious understanding problem that impacts their capability.

1 week ago 1 0 1 0

I'm not sure evolution is a good analogy here. Evolution isn't really optimization. Evolution as optimization is an okay model to think about some things, but it breaks down quickly. It doesn't have a goal. Even describing it as a non-stationary optimization process doesn't capture it well.

1 week ago 0 0 1 0

Why do you think defining pursuit is hard? This is incredibly easy, IMO. I make agents pursue things on a daily basis, and we do it with LLMs too.

All the complexity I see stems from difficulty having systems that understand the thing to pursue and in capability to pursue it.

1 week ago 0 0 1 1

To make this more concrete: which is harder, spinning up RLHF given the reward model, or learning the reward model?

The former is trivial. The hard part is in the latter, but that’s just a basic AI problem of building systems that can understand.
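
A sketch of that asymmetry, with the models stubbed out; the function names are my own, not any framework's API:

```python
# Given a reward model, wiring up the policy objective is a few lines:

import torch

def rlhf_policy_loss(policy_logprobs, reward_model_scores, baseline=0.0):
    # REINFORCE-style objective: push up log-probs of high-reward responses.
    advantages = reward_model_scores - baseline
    return -(policy_logprobs * advantages.detach()).mean()

# The hard part is producing reward_model_scores in the first place:
# learning a reward model from messy human preference data, e.g. a
# Bradley-Terry loss over (chosen, rejected) response pairs.

def reward_model_loss(score_chosen, score_rejected):
    return -torch.nn.functional.logsigmoid(score_chosen - score_rejected).mean()
```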

1 week ago 1 0 0 1

Reward hacking can be interpreted as one of two problems.
1. The goal was specified incorrectly.
2. There are bad suboptimal solutions.

The former is the problem of having a machine understand goals, not of asking it to act on them.

The latter means it isn't a capable agent to begin with.
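
A toy illustration of the two readings; the "tests pass" proxy below is entirely hypothetical:

```python
# Reading 1: misspecified goal. The proxy rewards "tests pass", which only
# correlates with the true goal ("code works").

def proxy_reward(tests_passed: bool) -> float:
    return float(tests_passed)        # ignores *how* the tests passed

def true_reward(code_works: bool) -> float:
    return float(code_works)

# An optimizer of the proxy happily "passes" tests by deleting them:
print(proxy_reward(tests_passed=True))   # 1.0: proxy satisfied
print(true_reward(code_works=False))     # 0.0: true goal missed

# Reading 2: bad suboptimal solutions. Even with the right reward, a weak
# optimizer can sit in a local optimum instead of solving the task.
# That's a capability failure, not an alignment one.
```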

1 week ago 1 0 1 0

They want whatever the function is. You agreed that we can make AI understand the goal. If you grant that we can build a function that understands the goal, then it is straightforward to use that function as the goal.

1 week ago 1 0 1 0

There is plenty of other research involving more complex agents with more explicitly defined goals.

And certainly if you look at any RL research, learning with goals is the name of the game.

1 week ago 1 0 1 0

It's also worth bearing in mind that these models work as "agents" at all because they underwent RL training where an explicit goal was provided.

Sometimes the goal is simple and verifiable, but RLHF predated that and modeled messy user preferences.

1 week ago 1 0 1 0

Making them want it is far easier than understanding it. If you have the function, you can define the optimization.

LLM "agents" today are unsophisticated and mostly just "unroll" models because its computationally cheap. But the biggest improvements are coming more sophisticated "agent harnesses."

1 week ago 1 0 1 1

The question "how do we make a machine understand a complex goal" is inherently an AI problem. It is not disjoint from other forms of understanding.

It gets solved by working on the AI problem. If we don't solve it, we won't have capable AI.

It's weird to think we'd solve it, but not use the tech.

1 week ago 1 0 1 0

Furthermore, the tech that makes AI more capable in a social world also facilitates goal specification.

MIRI used to worry about the complexity of our vague social goals being lost on AI. Then LLMs happened and this part is _much_ easier.

The more we work on "agents" the more that develops.

1 week ago 1 0 1 0

I've often argued that it is well aligned because of innate incentives.

No one wants AI that doesn't do what we want. We are forced to improve goal specification as capability improves.

Even with something as safe as video game AI that my team works on, there is a huge amount of "alignment" work.

1 week ago 1 0 1 1

Plausibly other models do it too, but I've noticed it far more with the Gemini models. (Or whatever AI Mode uses, which I believe is Gemini)

1 week ago 1 0 0 0

Something I find kind of unique to Google's Gemini models is when "discussing" a topic, they will often end their turn by asking what I think about some other relationship.

This is actually awesome because one of my main uses of LLMs is to work through ideas/thoughts like a rubber duck++

1 week ago 3 0 1 0

Entangling AI technology with religion is a colossally bad idea.

1 week ago 2 0 0 0

This is an excellent post.

I've made similar arguments against doomerism before, but I never made the connection to it being another flavor of the "counting argument" even though I've argued against its other flavors before too!

This framing exposes the flaw very clearly.

1 week ago 3 0 0 0

I'm just going to refer to this next time people tell me LLMs are experiencing human-like emotions purely because they appear to.

1 week ago 7 0 0 0

I can imagine different arguments.
- No, because the polynomial is sufficient and gates reduce the horizon too much
- Yes, for modeling sharp transitions and preventing slow growth

I'm curious if you all thought about this already. (There's also the option of interleaving these blocks.)

1 week ago 0 0 0 0

I have a more abstract question.

Causal PoM is a linear RNN where the transition matrix A is the identity and the residual is the polynomial non-linearity.

Recent RNNs have found making A content-dependent to be crucial, especially via gating (Gated DeltaNet).

Would causal PoM benefit from that too?
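
Roughly what I have in mind, sketched with illustrative shapes and a made-up polynomial (not the PoM paper's code):

```python
# A causal linear recurrence with identity transition vs. a gated,
# content-dependent transition in the style of gated linear RNNs.

import torch

def identity_recurrence(x, poly):
    # h_t = h_{t-1} + poly(x_t): the "A = I" reading of causal PoM.
    h = torch.zeros(x.shape[-1])
    states = []
    for x_t in x:
        h = h + poly(x_t)
        states.append(h)
    return torch.stack(states)

def gated_recurrence(x, poly, w_gate):
    # h_t = a_t * h_{t-1} + poly(x_t), with a_t = sigmoid(W x_t) in (0, 1):
    # content-dependent decay, as in Gated DeltaNet-style models.
    h = torch.zeros(x.shape[-1])
    states = []
    for x_t in x:
        a_t = torch.sigmoid(w_gate @ x_t)
        h = a_t * h + poly(x_t)
        states.append(h)
    return torch.stack(states)

poly = lambda x: x + x**2          # hypothetical polynomial non-linearity
x = torch.randn(5, 8)              # (time, dim)
w = torch.randn(8, 8) / 8**0.5
print(identity_recurrence(x, poly).shape, gated_recurrence(x, poly, w).shape)
```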

1 week ago 0 0 1 0
UW graduate student deported through SEA as protesters demand answers
A union representing University of Washington graduate student workers says Kennedy Orwa's student visa was rescinded without explanation.

Our PhD student Kennedy Orwa, who studies applications of AI to health care, was hastily deported to Kenya today along with his 13-year-old son, without the opportunity to speak to legal counsel.

King 5 reports that he held a valid visa that was rescinded without explanation.

1 week ago 4159 2281 87 113

Yes, increasing the budget for that and more is what I advocated.

If you're not interested in reading the thread, that's fine, but then you don't have to respond at all! It's not a discussion if you're just throwing untethered comments out there :p

1 week ago 0 0 0 0

Pretty neat all the same. I'm hopeful a future where locally run models are productive isn't that far off.

1 week ago 0 0 0 0

Yep, like I said in my second post in the thread, education sadly often being boring to kids makes this complicated!

In my ideal world, we massively increase our budget for education to make it more fun like it should be. Not too long ago I made a thread on that.

bsky.app/profile/jmac...

1 week ago 0 0 1 0