Joe Carlsmith's Substack

New report: "Scheming AIs: Will AIs fake alignment during training in order to get power?"

Joe Carlsmith
Nov 15, 2023

I examine the probability of a behavior sometimes called "deceptive alignment."

© 2025 Joe Carlsmith