In the exercise, VERSES compared OpenAI's advanced reasoning model, o1-preview, to Genius. Each model played 100 games of Mastermind, with up to ten guesses per game to crack the code.
Figure 1. The Mastermind game (left) and its biological counterpart (right). The goal is to break the hidden “code” (top) in as few rounds as possible. In each round, the player queries the system ...
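To make the game mechanics concrete, here is a minimal sketch of one Mastermind round and a single game capped at ten guesses. It assumes the standard rules of the board game (feedback counts exact-position matches and right-symbol, wrong-position matches); the source does not state the code length, the symbol set, or how either model was actually queried, so `score_guess`, `play`, and the `guesser` callback are hypothetical names used only for illustration.

```python
from collections import Counter


def score_guess(secret, guess):
    """Return (exact, partial) feedback under standard Mastermind rules (an assumption)."""
    # Exact matches: right symbol in the right position.
    exact = sum(s == g for s, g in zip(secret, guess))
    # Multiset overlap counts every symbol match regardless of position.
    overlap = sum((Counter(secret) & Counter(guess)).values())
    # Partial matches: right symbol, wrong position.
    return exact, overlap - exact


def play(secret, guesser, max_guesses=10):
    """Run one game; return the number of guesses used, or None if the code was not cracked."""
    history = []  # list of (guess, feedback) pairs seen so far
    for round_no in range(1, max_guesses + 1):
        guess = guesser(history)  # hypothetical callback: picks the next guess
        feedback = score_guess(secret, guess)
        history.append((guess, feedback))
        if feedback[0] == len(secret):
            return round_no
    return None
```

A player is anything that maps the history of guesses and feedback to a next guess, so the same `play` loop could in principle be repeated over many hidden codes to compare players on solve rate and guesses used.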