Thomas Bush
London
Hi, I’m Tom! My goal is to ensure advanced AI lives up to its potential to benefit humanity. My current research revolves around the unifying theme of safe and reliable AI. Specifically, I am interested in answering the following two questions:
- Under what conditions do models learn to reason in a safe and reliable fashion?
- How can we verify whether a model is reasoning in a safe and reliable manner?
To this end, my primary research interests are (i) mechanistic interpretability, (ii) reinforcement learning and (iii) world models. I am especially excited about research at the intersection of these topics.
I am currently a MATS research scholar supervised by Adria Garriga-Alonso. Until recently, I was also a research assistant at the Krueger AI Safety Lab, supervised by Prof David Krueger and working with Usman Anwar and Stephen Chung. Previously, I was an MPhil student in Machine Learning and Machine Intelligence at the University of Cambridge, and a BSc student in Philosophy, Politics and Economics at LSE. I am actively looking for research opportunities and PhD positions, so please reach out!