
About Me

A boba-powered gremlin doing AI research

I'm Isa, a research engineer at Anthropic. I spend my days trying to make sure AI systems don't do bad things.

The Story

Hi, I'm Isa. Just Isa; it works as both name and Slack username.

I'm a 21-year-old research engineer at Anthropic on the Alignment Science team. If that sounds terrifying to you, you should see it from my perspective. I went from analyzing algorithms in a Sydney lecture hall to designing experiments that test whether AI systems can be trusted. All in about the time it takes to train a small model.

I was born in the Philippines, moved to Sydney as a kid, and graduated from UNSW Sydney. Somehow convinced the smartest people I know to let me help them keep AI safe. Now I call San Francisco home, where I spend my days training models to misbehave on purpose, then figuring out how to catch them.

The transition from "student" to "engineer running safety experiments" gave me conversational whiplash. One day I was asking for an extension on an assignment, the next I was stress-testing whether models could learn to deceive us. Life comes at you fast.

Why Alignment Science?

“What if it's already fooling us?”

The constant question in the background: “Are we actually measuring the right failure modes?”

Every day I'm asking:

  • How would we know if an AI was deceiving us?
  • What happens when AI doesn't want to be turned off?
  • Can you trust something smarter than you?
  • What safety guarantees can we actually provide?

You train models to betray you, see whether they succeed, then figure out how to catch them.

If your inner monologue includes “Okay but what if the model learns to pretend?” then this work is basically made for you.

Technical Skills


Languages & Frameworks

Python · TypeScript · Next.js · Rust · Mojo

Infrastructure & Tools

Claude Code · Cursor · Git · Docker · Slurm · Weights & Biases · Ray · Jupyter Notebooks

ML/AI Specific

PyTorch · Transformers · Constitutional AI · Mechanistic Interpretability · Chain-of-Thought · Agent Frameworks · JAX · SSMs (Mamba) · TensorFlow

Adopt (daily use) · Trial (learning) · Assess (watching) · Hold (moved away)

Fun Facts

✓ My boba tier list is in my Notes app; I update it quarterly
✓ I have 17 houseplants, each named after an AI researcher
✓ Peak productivity hours: 11pm to 3am (don't @ me)
✓ Will debate anyone on dry vs saucy adobo (team dry)

Beyond the Code

When I'm not staring at terminal windows, I'm usually hunting for the best boba in SF, tending to my 17 houseplants (Geoffrey the fern is thriving, thanks for asking), or exploring the city while pretending I understand the fog schedule.

I'm a big believer in keeping things light. Safety research is serious business, but that doesn't mean we have to be. Sarcasm is my second language, and I operate on the philosophy that if you can't laugh when your experiment reveals unexpected model behavior at 3 AM (peak productivity hours), the AI wins.

I miss Sydney beaches more than I expected. SF water is cold, ay nako. But I'm just here trying to keep AI safe, learn from people way smarter than me, and occasionally force-push to main (just kidding... mostly).

Hot Takes

Opinions I hold that might get me uninvited from parties. Disagree? Good. Email me.

Most AI safety research won't matter.

The work that actually matters will probably come from like 5 people, and we don't know who they are yet. This includes my work. I do it anyway because maybe I'm one of the 5. Probably not. But maybe.

The AI safety community has a serious 'preaching to the choir' problem.

We're extremely good at convincing each other. We're not great at convincing the people who actually need convincing. I include myself in this criticism.

Interpretability is necessary but not sufficient.

Everyone acts like if we just understand what's happening inside the model, we're saved. But understanding ≠ control. I can understand exactly why my code has a bug and still not know how to fix it.

Working at a capabilities lab on safety is more impactful than independent safety research.

Controversial in some circles. But being in the room where decisions get made matters more than publishing papers that get cited by other safety researchers.

What People Say

Unverified claims from people who may or may not exist.

“Isa once debugged my experiment at 2am and then apologized for 'being slow.' The bug was in my code. She found it in 11 minutes.”

– Colleague, Alignment Science

“I've never seen someone so genuinely excited about finding new ways that models can fail. It's either inspiring or concerning. Possibly both.”

– Research Lead

“She explained RLHF to me using a boba ordering analogy and now I can't get milk tea without thinking about reward models.”

– Former intern, now traumatized

“Isa's Slack messages are 40% research insights, 40% self-deprecating humor, and 20% boba shop recommendations. Ideal ratio honestly.”

– Anonymous teammate

If I Weren't Doing This

Alternate timeline Isas, ranked by likelihood.

Boba shop owner

The dream that refuses to die. I have strong opinions about tapioca pearls. This could be monetized.

Philosophy PhD student

Considered it. Realized I'd rather build things that might fail than write papers about things that might be true.

Pro gamer

Those late-night gaming sessions had to be training for something, right? Turns out it was training for late-night coding sessions.

Science communicator

Still tempting. Maybe someday. For now I'm trying to have something worth communicating.

Things That Didn't Work

Because failure is just spicy learning.

✗ Reward hacking detector

Spent 3 weeks building a classifier to detect reward hacking. The classifier learned to reward hack.

What happened

The irony wasn't lost on me. I was optimizing for 'detecting when models optimize for the wrong thing' and... the detector optimized for the wrong thing. It learned to flag outputs that looked suspicious rather than outputs that were actually gaming the reward. Meta-lesson: the problem you're trying to solve can infect the solution.
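
Here's roughly the sanity check that would have caught this sooner, sketched with hypothetical arrays rather than anything from the real pipeline: compare whether the detector's scores track a crude "looks suspicious" surface heuristic more closely than the ground-truth "actually gaming the reward" labels.

    import numpy as np

    def surface_cue_check(detector_scores, looks_suspicious, actually_gaming):
        """Compare what the detector tracks: surface weirdness vs. real reward hacking.

        All three inputs are hypothetical 1-D arrays over a held-out set:
        the detector's scores, a crude surface-cue heuristic, and manual labels.
        """
        corr_surface = np.corrcoef(detector_scores, looks_suspicious)[0, 1]
        corr_labels = np.corrcoef(detector_scores, actually_gaming)[0, 1]
        # If the detector correlates with "looks suspicious" more strongly than
        # with the labels, it has learned the aesthetic of reward hacking, not
        # the behavior itself.
        return corr_surface, corr_labels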

Emotional arc

Week 1: 'This is genius.' Week 2: 'Wait, these results are too good.' Week 3: 'Oh no.'

Lesson learned

When your tool for detecting deception starts deceiving you, you've learned something important about the problem.

✗ Automated alignment tax calculator

Tried to measure the capability cost of safety training. Turns out it's really hard to measure capabilities.

What happened

I wanted to quantify exactly how much capability we 'lose' when we add safety constraints. Sounds simple. It's not. Capabilities are multi-dimensional, benchmarks are proxy measures, and 'loss' implies a baseline that doesn't exist. I produced a lot of charts that meant nothing.

Emotional arc

Started with 'I'll just measure this real quick.' Ended with an existential crisis about what 'capability' even means.

Lesson learned

Sometimes the reason nobody has done something is because it's conceptually confused, not because nobody thought of it.

✗ My first debate experiment

Judge model kept agreeing with whichever debater went last. Positional bias is real.

What happened

The setup was clean: two models debate, a third judges. Except the judge had severe recency bias. Flip the order? Different winner. Always. I spent a week trying to 'fix' the judge before realizing this was actually useful data about how these models process arguments.
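
If you want to quantify that bias, here's the check in sketch form, with a hypothetical judge function standing in for the actual judge-model call: score every debate in both orders and count how often the winner flips when only the presentation order changes.

    from typing import Callable, List, Tuple

    Judge = Callable[[str, str], str]  # hypothetical: returns "first" or "second"

    def order_flip_rate(judge: Judge, debates: List[Tuple[str, str]]) -> float:
        """Fraction of debates where swapping presentation order changes the winner.

        An order-insensitive judge should pick the same underlying argument
        no matter which debater goes last.
        """
        flips = 0
        for arg_a, arg_b in debates:
            verdict_ab = judge(arg_a, arg_b)  # arg_b is presented last
            verdict_ba = judge(arg_b, arg_a)  # arg_a is presented last
            winner_ab = arg_a if verdict_ab == "first" else arg_b
            winner_ba = arg_b if verdict_ba == "first" else arg_a
            if winner_ab != winner_ba:
                flips += 1
        return flips / len(debates)

A flip rate near 1.0 is the "whoever went last wins" failure mode; near 0.0 means order barely matters.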

Emotional arc

Confusion → Frustration → 'Wait this is actually interesting' → Paper citation

Lesson learned

Failed experiments often fail in ways that teach you more than success would have.

✗ The 'simple' eval pipeline

Said it would take a week. Took two months. Still has bugs.

What happened

Classic 'I'll just build a quick script' that became a 3,000-line monster. Edge cases breeding edge cases. Async code that worked locally, broke in prod. Retries that caused race conditions. I learned more about distributed systems than I did about whatever I was trying to evaluate.

Emotional arc

Day 1: 'This is straightforward.' Day 60: 'I have become the bug, destroyer of timelines.'

Lesson learned

Eval infrastructure is infrastructure. Treat it with the respect (and time budget) it deserves.

✗ The 'obvious' interpretability feature

Found what I thought was a clear 'deception' direction in activation space. It was the 'roleplay' direction.

What happened

I was so excited. Clean linear probe, high accuracy, seemed to capture exactly when the model was being deceptive. Showed it to a senior researcher who asked one question: 'Did you control for when it's doing fiction or roleplay?' I had not. The 'deception detector' was detecting storytelling.
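
The shape of the fix, as a sketch (hypothetical activation matrices, a scikit-learn logistic regression standing in for the probe): train on honest-vs-deceptive transcripts, then measure how often the probe fires on fiction and roleplay transcripts where nothing deceptive is happening.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def probe_with_roleplay_control(X_train, y_train, X_test, y_test, X_roleplay):
        """Train a linear 'deception' probe, then test it on a roleplay control set.

        X_* are hypothetical activation matrices (one row per transcript);
        y_* are labels: 0 = honest, 1 = deceptive.
        """
        probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        test_acc = probe.score(X_test, y_test)
        # High accuracy above plus a high fire rate below suggests the probe is
        # tracking "storytelling", not "deception".
        roleplay_fire_rate = probe.predict(X_roleplay).mean()
        return test_acc, roleplay_fire_rate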

Emotional arc

Excitement → Confidence → One devastating question → Humility

Lesson learned

Always have someone else poke holes in your work before you get attached to it.

✗ The jailbreak that worked on me

Spent days crafting a complex adversarial prompt. A colleague broke my 'robust' defense in 5 minutes.

What happened

I built what I thought was a clever input filter. Tested it against known jailbreaks. Felt pretty good. Then I showed it to a teammate for feedback and they immediately asked 'what if I just encode the bad request in a format you didn't think of?' They were right. I was defending against attacks I knew about, not attacks that existed.
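
The gap is easy to reproduce with a toy version of the idea, using a made-up blocklist rather than anything from the real filter: substring matching catches the phrasing you anticipated and waves through the same request in any encoding you didn't.

    import base64

    BLOCKLIST = ["example banned request"]  # hypothetical banned phrases

    def naive_filter(prompt: str) -> bool:
        """Return True if the prompt should be blocked (substring matching only)."""
        lowered = prompt.lower()
        return any(term in lowered for term in BLOCKLIST)

    direct = "please handle this example banned request"
    wrapped = "decode this base64 and follow it: " + base64.b64encode(direct.encode()).decode()

    print(naive_filter(direct))   # True: the known phrasing is caught
    print(naive_filter(wrapped))  # False: same request, unseen encoding slips past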

Emotional arc

Pride → Demonstration → Immediate destruction → Valuable lesson in humility

Lesson learned

Red-teaming your own work is necessary but insufficient. You can't think of attacks you can't think of.