Scalable Oversight Research
Developing techniques to keep highly capable models helpful and honest, even as they surpass human-level intelligence in various domains.
Projects
A collection of research projects, engineering work, and tools I've contributed to.
Most of this work is internal research at Anthropic and can't be shared publicly.
If you're interested in learning more about what we do, check out our published research.