Claude Opus 4.6
The model’s performance is state-of-the-art on several evaluations. For example, it achieves the highest score on the agentic coding evaluation Terminal-Bench 2.0 and leads all other frontier models on Humanity’s Last Exam, a complex multidisciplinary reasoning test (source: anthropic.com).