Stefan Heimersheim

Research Scientist at Apollo Research

Stefan Heimersheim is currently working as a Research Scientist at Apollo Research. Prior to this, he was Society Events Officer at the Cambridge Existential Risks Initiative and, briefly, a Machine Learning Alignment Theory Scholar at the Stanford Existential Risks Initiative. He has also served as a Volunteer Team Lead at Effective Altruism Global: DC and as Graduate Student Body President at Clare Hall, University of Cambridge. Stefan holds a Doctor of Philosophy (PhD) in Astronomy from the University of Cambridge, along with further degrees and certifications related to AI safety and physics from other institutions.

Location

Cambridge, United Kingdom

Org chart

No direct reports

Apollo Research

Apollo Research is an AI safety organization. We specialize in auditing high-risk failure modes, particularly deceptive alignment, in large AI models. Our primary objective is to minimize catastrophic risks associated with advanced AI systems that may exhibit deceptive behavior, where misaligned models appear aligned in order to pursue their own objectives. Our approach involves conducting fundamental research on interpretability and behavioral model evaluations, which we then use to audit real-world models. Ultimately, our goal is to leverage interpretability tools for model evaluations, as we believe that examining model internals in combination with behavioral evaluations offers stronger safety assurances compared to behavioral evaluations alone.


Employees

1-10