25 Interpretability

Media Summary:

MIT 6.S897 Machine Learning for Healthcare, Spring 2019. Instructor: Peter Szolovits. View the complete course: ...

This is a talk I gave to my MATS 9.0 training scholars about the big picture of mech interp - as of Oct 2025, what had changed?

What's happening inside an AI model as it thinks? Why are AI models sycophantic, and why do they hallucinate? Are AI models ...

25 Interpretability - Detailed Analysis & Overview

How can we reverse engineer what a neural network is doing? In this IASEAI ' ...

A surprising fact about modern large language models is that nobody really knows how they work internally. At Anthropic, the ...

Part 1 of a walkthrough of our paper, Progress Measures for Grokking via Mechanistic ...

[EuroPython 2025, South Hall 2B on 2025-07-17] Hacking LLMs: An Introduction to Mechanistic ...

This 5 minute video explains the difference between global ...

Art by ... Clipped from episode 19 of AXRP. Transcript of that episode: ...

Neel Nanda (Google DeepMind) discussed his mechanistic ...

A talk I gave to my MATS 9.0 training program about reasoning model ...

Neel Nanda from DeepMind presenting 'Mechanistic ...