Last week Anthropic launched Claude 3.5 Sonnet, with particularly notable improvements in graduate-level reasoning, code, and mixed evaluations (BIG-Bench Hard, which, contrary to popular belief, is not what I say when watching guys bench press at the gym, but rather
Claude 3.5 Sonnet, Imbue 70B, and models of…