Joe Carlsmith's Substack
Video and transcript of talk on AI welfare
An overview of my take on AI welfare as of May 2025, from a talk at Anthropic.
May 22
•
Joe Carlsmith
The stakes of AI moral status
On seeing and not seeing souls.
May 21
April 2025
Video and transcript of talk on automating alignment research
From a talk at Anthropic in April 2025
Apr 30
•
Joe Carlsmith
Can we safely automate alignment research?
It's really important; we have a real shot; there are a lot of ways we can fail.
Apr 30
•
Joe Carlsmith
March 2025
AI for AI safety
We should try extremely hard to use AI labor to help address the alignment problem.
Mar 14
•
Joe Carlsmith
Paths and waystations in AI safety
On the structure of the path to safe superintelligence, and some possible milestones along the way.
Mar 11
•
Joe Carlsmith
February 2025
When should we worry about AI power-seeking?
Examining the conditions required for rogue AI behavior.
Feb 19
•
Joe Carlsmith
What is it to solve the alignment problem?
Also: to avoid it? Handle it? Solve it forever? Solve it completely?
Feb 13
•
Joe Carlsmith
How do we solve the alignment problem?
Introduction to an essay series on paths to safe, useful superintelligence.
Feb 13
•
Joe Carlsmith
January 2025
Fake thinking and real thinking
When the line pulls at your hand.
Jan 28
•
Joe Carlsmith
December 2024
Takes on "Alignment Faking in Large Language Models"
What can we learn from recent empirical demonstrations of scheming in frontier models?
Dec 18, 2024
•
Joe Carlsmith
October 2024
Video and transcript of presentation on Otherness and control in the age of AGI
An attempt to distill down the whole "Otherness and control" series into a single talk.
Oct 8, 2024
•
Joe Carlsmith