CLIP (Contrastive Language-Image Pre-training) connects images and text by learning common ...
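
As a rough illustration of the "contrastive" part of the name, the sketch below computes a symmetric contrastive loss over a small batch of paired image and text embeddings: matching pairs are pushed toward high similarity and mismatched pairs toward low similarity. This is a minimal sketch under stated assumptions, not CLIP's actual implementation. The function names (`l2_normalize`, `cross_entropy_diagonal`, `clip_style_loss`), the toy 2-dimensional embeddings, and the fixed `temperature` value are all hypothetical choices for illustration; in practice the embeddings come from trained image and text encoders, and the temperature is typically a learned parameter.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_embs[i] and text_embs[i] are assumed to describe the same item.
    The temperature is fixed here for illustration only.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j]: cosine similarity of image i and text j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
        for img in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image view
    loss_i2t = cross_entropy_diagonal(logits)    # each image picks its text
    loss_t2i = cross_entropy_diagonal(logits_t)  # each text picks its image
    return (loss_i2t + loss_t2i) / 2.0


# Toy data: correctly paired embeddings versus the same texts in swapped order.
images = [[1.0, 0.0], [0.0, 1.0]]
matched = clip_style_loss(images, [[2.0, 0.1], [0.1, 3.0]])
shuffled = clip_style_loss(images, [[0.1, 3.0], [2.0, 0.1]])
print(matched < shuffled)
```

The loss is lower when each image is paired with its own text than when the texts are swapped, which is the signal a contrastive objective of this kind trains on.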