Training large language models (LLMs) and vision-language models (VLMs) through reinforcement learning delivers better results than training them on hand-crafted examples.