Joe Carlsmith's Substack
Takes on "Alignment Faking in Large Language Models"
What can we learn from recent empirical demonstrations of scheming in frontier models?
Dec 18, 2024
•
Joe Carlsmith
October 2024
Video and transcript of presentation on Otherness and control in the age of AGI
An attempt to distill down the whole "Otherness and control" series into a single talk.
Oct 8, 2024
•
Joe Carlsmith
September 2024
Extended audio/transcript from my conversation with Dwarkesh Patel
Extra content on my Otherness series, AI takeover, p(God), and more.
Sep 30, 2024
•
Joe Carlsmith
June 2024
Loving a world you don’t trust
Garden, campfire, healing water.
Jun 18, 2024
•
Joe Carlsmith
March 2024
On attunement
Examining a certain kind of meaning-laden receptivity to the world.
Mar 25, 2024
•
Joe Carlsmith
Video and transcript of presentation on Scheming AIs
An intro to my work on scheming/"deceptive alignment."
Mar 22, 2024
•
Joe Carlsmith
On green
Examining a philosophical vibe that contrasts with "deep atheism."
Mar 21, 2024
•
Joe Carlsmith
January 2024
On the abolition of man
What does it take to avoid tyranny towards the future?
Jan 18, 2024
•
Joe Carlsmith
Being nicer than Clippy
Let's be the sort of species that aliens wouldn't fear the way we fear paperclippers.
Jan 16, 2024
•
Joe Carlsmith
An even deeper atheism
Who isn't a paperclipper?
Jan 11, 2024
•
Joe Carlsmith
Does AI risk “other” the AIs?
Examining Robin Hanson's critique of the AI risk discourse.
Jan 9, 2024
•
Joe Carlsmith
When "yang" goes wrong
On the connection between deep atheism and seeking control.
Jan 8, 2024
•
Joe Carlsmith