News

On the MATH benchmark of competition level math word problems, for example, Meta’s model posted a score of 73.8, compared to GPT-4o’s 76.6 and Claude 3.5 Sonnet’s 71.1.
When employees at Meta started developing their flagship AI model, Llama 3, they faced a simple ethical question. The program would need to be trained on a huge amount of high-quality writing to ...