AI Sandbagging - Computerphile

Name: AI Sandbagging - Computerphile
Uploaded: May 23, 2025
Duration: 700 s

Computerphile (2.62M subscribers)

107.3K views

Following the theme of AI research and safety, Aric Floyd talks about how some Large Language Models might follow the all-too-human trait of sandbagging - "lying" about their true capabilities.

AI Sandbagging Paper: https://www.apolloresearch.ai/research/scheming-reasoning-evaluations

Computerphile is supported by Jane Street. Learn more about them (and exciting career opportunities) at: https://jane-st.co/computerphile

This video was filmed and edited by Sean Riley.

Computerphile is a sister project to Brady Haran's Numberphile. More at https://www.bradyharanblog.com
