Learning Notebook - David Rostcheck
Notes
Anthropic releases Claude 3, outperforming all models including GPT-4. Once again, we had this out to The Memo readers within just a few hours of the model release. Key points:
- Alan's estimate for Claude 3 Opus: 2T parameters trained on 40T tokens.
- Three model sizes: Haiku (~20B), Sonnet (~70B), and Opus (~2T).
- Trained with synthetic data, probably generated by Claude 2.1 or GPT-4.
- New highest MMLU score (Claude 3 = 86.8 vs GPT-4 = 86.4).
- Long context (working memory): 200K tokens standard, 1M for researchers.
- Multimodal: has vision, like GPT-4V and Gemini. Also has built-in 'tool use'.

My initial testing shows Claude 3 Opus to be on par with GPT-4, and perhaps better on some metrics. This is the one to beat! (And we'll beat it shortly.)