38.2 - Jesse Hoogland on Singular Learning Theory
AXRP - the AI X-risk Research Podcast - En podcast af Daniel Filan

Kategorier:
You may have heard of singular learning theory, and its "local learning coefficient", or LLC - but have you heard of the refined LLC? In this episode, I chat with Jesse Hoogland about his work on SLT, and using the refined LLC to find a new circuit in language models. Patreon: https://www.patreon.com/axrpodcast Ko-fi: https://ko-fi.com/axrpodcast The transcript: https://axrp.net/episode/2024/11/27/38_2-jesse-hoogland-singular-learning-theory.html FAR.AI: https://far.ai/ FAR.AI on X (aka Twitter): https://x.com/farairesearch FAR.AI on YouTube: https://www.youtube.com/@FARAIResearch The Alignment Workshop: https://www.alignment-workshop.com/ Topics we discuss, and timestamps: 00:34 - About Jesse 01:49 - The Alignment Workshop 02:31 - About Timaeus 05:25 - SLT that isn't developmental interpretability 10:41 - The refined local learning coefficient 14:06 - Finding the multigram circuit Links: Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient: https://arxiv.org/abs/2410.02984 Investigating the learning coefficient of modular addition: hackathon project: https://www.lesswrong.com/posts/4v3hMuKfsGatLXPgt/investigating-the-learning-coefficient-of-modular-addition Episode art by Hamish Doodles: hamishdoodles.com