Claude Opus 4.6

The model’s performance is state-of-the-art on several evaluations. For example, it achieves the highest score on the agentic coding evaluation Terminal-Bench 2.0 and leads all other frontier models on Humanity’s Last Exam, a complex multidisciplinary reasoning test (source: anthropic.com).

Posted or updated February 17, 2026.
