AXRP - the AI X-risk Research Podcast
A podcast by Daniel Filan
58 Episodes
34 - AI Evaluations with Beth Barnes
Published: 28.7.2024
33 - RLHF Problems with Scott Emmons
Published: 12.6.2024
32 - Understanding Agency with Jan Kulveit
Published: 30.5.2024
31 - Singular Learning Theory with Daniel Murfet
Published: 7.5.2024
30 - AI Security with Jeffrey Ladish
Published: 30.4.2024
29 - Science of Deep Learning with Vikrant Varma
Published: 25.4.2024
28 - Suing Labs for AI Risk with Gabriel Weil
Published: 17.4.2024
27 - AI Control with Buck Shlegeris and Ryan Greenblatt
Published: 11.4.2024
26 - AI Governance with Elizabeth Seger
Published: 26.11.2023
25 - Cooperative AI with Caspar Oesterheld
Published: 3.10.2023
24 - Superalignment with Jan Leike
Published: 27.7.2023
23 - Mechanistic Anomaly Detection with Mark Xu
Published: 27.7.2023
Survey, store closing, Patreon
Published: 28.6.2023
22 - Shard Theory with Quintin Pope
Published: 15.6.2023
21 - Interpretability for Engineers with Stephen Casper
Published: 2.5.2023
20 - 'Reform' AI Alignment with Scott Aaronson
Published: 12.4.2023
Store, Patreon, Video
Published: 7.2.2023
19 - Mechanistic Interpretability with Neel Nanda
Published: 4.2.2023
New podcast - The Filan Cabinet
Published: 13.10.2022
18 - Concept Extrapolation with Stuart Armstrong
Published: 3.9.2022
AXRP (pronounced axe-urp) is the AI X-risk Research Podcast where I, Daniel Filan, have conversations with researchers about their papers. We discuss each paper, and hopefully get a sense of why it was written and how it might reduce the risk of AI causing an existential catastrophe: that is, permanently and drastically curtailing humanity's future potential. You can visit the website and read transcripts at axrp.net.