Claude 2 Feels Different
anthropic released claude 2 a few days ago and opened it up to more people. i've been playing with it non-stop.
something about it feels... different.
what's new
the obvious stuff:
- a much bigger context window (100k tokens!)
- better at code
- more up-to-date knowledge
- generally more capable
but that's not what caught my attention.
the thing i noticed
when i ask claude something it's uncertain about, it says so. not grudgingly. not with lots of caveats. it just... admits uncertainty.
"I'm not entirely sure about this, but based on what I know..."
"This is outside my training data, so I might be wrong here..."
compare that to chatgpt, which often confidently states things that turn out to be wrong.
i don't know if this is technically "better." but it feels more honest.
testing it
i asked both claude 2 and gpt-4 some questions i knew the answers to.
on factual stuff: roughly equal.
on uncertain territory (recent events, niche topics): claude was more likely to express uncertainty. gpt-4 was more likely to make something up.
sample size is small. not scientific. but the pattern was noticeable.
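if i wanted to make the comparison slightly less vibes-based, i could count hedging phrases in each model's answers. here's a minimal sketch of that idea — the phrase list and the sample responses are mine, not real model output, and `hedge_count`/`hedge_rate` are just names i made up:

```python
# Rough way to quantify "expresses uncertainty": count hedging phrases.
# The phrase list and sample responses are illustrative, not real data.

HEDGES = [
    "i'm not sure",
    "i'm not entirely sure",
    "i might be wrong",
    "outside my training data",
    "i don't know",
    "i can't verify",
]

def hedge_count(response: str) -> int:
    """Number of hedging phrases appearing in a response."""
    text = response.lower()
    return sum(text.count(h) for h in HEDGES)

def hedge_rate(responses: list[str]) -> float:
    """Fraction of responses containing at least one hedge."""
    hedged = sum(1 for r in responses if hedge_count(r) > 0)
    return hedged / len(responses)

# Toy example with invented responses:
sample = [
    "I'm not entirely sure about this, but based on what I know...",
    "This is outside my training data, so I might be wrong here...",
    "Paris is the capital of France.",
]
print(hedge_rate(sample))  # 2 of the 3 responses hedge
```

obviously a real evaluation would need many prompts and a better hedge detector, but even this would beat "i noticed a vibe."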
the philosophy
i went down a rabbit hole on anthropic's research. their approach is fundamentally different from openai's.
they focus on:
- constitutional AI: training models to follow principles, not just predict text
- reducing harmful outputs: before maximizing capability
- interpretability: understanding why models do what they do
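as far as i understand the paper, the constitutional AI part boils down to a critique-and-revise loop: the model drafts an answer, critiques its own draft against a written principle, then rewrites it, and the revised answers become training data. here's a toy sketch of that loop — `model()` is a dummy stand-in for a real LLM call, and none of this is anthropic's actual code or prompts:

```python
# Toy sketch of a constitutional-AI-style critique/revise loop.
# `model` is a placeholder so the example runs without a real LLM;
# the principles are paraphrased, not Anthropic's actual constitution.

PRINCIPLES = [
    "Choose the response that is most honest about uncertainty.",
    "Choose the response least likely to cause harm.",
]

def model(prompt: str) -> str:
    """Stand-in for an LLM call. A real system would query a model here."""
    return f"[model output for: {prompt[:40]}...]"

def constitutional_revision(prompt: str) -> str:
    """Draft an answer, then critique and revise it once per principle."""
    draft = model(prompt)
    for principle in PRINCIPLES:
        critique = model(
            f"Critique this response against the principle '{principle}':\n{draft}"
        )
        draft = model(
            f"Rewrite the response to address this critique:\n{critique}\n"
            f"Original response:\n{draft}"
        )
    # In the paper, these revised answers are collected as supervised
    # training data, so the principles get baked into the model itself.
    return draft
```

the interesting part to me is that the principles are written in plain english, so the "values" are at least inspectable.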
openai is "move fast, build capability, figure out safety later." anthropic is "build safety into the process from the start."
both might work. but i know which approach makes me sleep better at night.
why i care
i'm interested in working at one of these companies eventually. or at least contributing to this field.
and the more i read about anthropic, the more i think: these are the questions i care about.
not just "can AI do X?" but "should AI do X?" and "how do we make sure AI does X safely?"
the honest take
claude 2 isn't a game-changer in capability. gpt-4 is probably still slightly ahead on raw benchmarks.
but anthropic's approach interests me more than openai's. the philosophy behind the product matters.
maybe that's naive. maybe capability is all that matters in the market. but i hope not.
i've started reading anthropic's papers more seriously. the constitutional AI paper is fascinating. and confusing. mostly confusing.