Claude Opus had a MASK honesty rate of 91.7 percent, compared to 90.3 percent for Opus 4.6 and 89.1 percent for Sonnet 4.6.
The company says Mythos is too dangerous to release publicly. Cybersecurity experts agree the model's capabilities matter, ...
Security researchers used GPT-5.4 and Claude Opus 4.6 in an open-source harness to reproduce Anthropic's Mythos vulnerability ...