What is it to solve the alignment problem?

Feb 13

Also: to avoid it? Handle it? Solve it forever? Solve it completely?

1 Comment

It’s good to have a paper like this that fully explicates what is meant when we say a word that the industry has nearly obliterated with overuse. Well done. Are there any current alignment projects incorporating RL or other AI tech into their approaches that you’re excited about? I feel like humans vibing their way to value sets for safety and control may be a more short-lived solution than we thought a year ago.

Joe Carlsmith's Substack
