Contrastive Learning
Sungyeon Kim
2024. 6. 28. 00:10
1. Contrastive Learning
1) Initially, models learned only image representations, for image-only tasks.
2) Contrastive learning refers to extending this representation learning to language and multimodal tasks.
** The key is to pair each image with semantically related text (a toy sketch of building such positive/negative pairs follows below).
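Below is a toy sketch (PyTorch assumed; random tensors stand in for real encoder outputs) of how a batch of matched image-text pairs yields positives and negatives. The margin loss is just one illustrative objective, not any specific paper's:

```python
import torch
import torch.nn.functional as F

# Toy batch: image i is paired with its semantically related caption i.
B, D = 4, 8
image_emb = F.normalize(torch.randn(B, D), dim=-1)  # stand-in encoder outputs
text_emb = F.normalize(torch.randn(B, D), dim=-1)

sim = image_emb @ text_emb.t()             # (B, B) cosine similarities
pos_mask = torch.eye(B, dtype=torch.bool)  # diagonal = matched pairs
pos = sim.diag()                           # positives: pull together

# Simple margin loss: every mismatched text must score lower than the
# matched one by at least `margin`.
margin = 0.2
hinge = F.relu(margin + sim - pos.unsqueeze(1))  # (B, B) per-pair hinge
loss = hinge.masked_fill(pos_mask, 0.0).mean()   # ignore the diagonal
```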
2. CLIP (Contrastive Language-Image Pre-training)
1) separate encoders for images and text
2) project them into a shared latent space
3) the model is trained on image-text pairs
4) the training objective only aligns image-text pairs across modalities (see the loss sketch below)
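A minimal sketch of CLIP's symmetric contrastive (InfoNCE) objective, assuming the two encoders and projection heads already exist; the function name and the 0.07 temperature are illustrative choices, not CLIP's exact API:

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE over a batch of matched image-text pairs.

    image_emb, text_emb: (B, D) outputs of the two separate encoders,
    already projected into the shared latent space.
    """
    # Normalize so the dot product is cosine similarity
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # (B, B) similarity matrix; the diagonal holds the matched pairs
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Contrast each image against all texts in the batch, and vice versa
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```

Because the targets are the diagonal indices, each image is trained to rank its own caption above every other caption in the batch, and vice versa.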
3. Triple Contrastive Learning (TCL)
1) incorporates both cross-modal and intra-modal objectives
2) not only aligns image-text pairs but also ensures that similar inputs within the same modality are close in the representation space (a simplified loss sketch follows below).
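A simplified sketch of how the cross-modal and intra-modal terms can be combined, assuming a second "view" of each input is available (e.g., an augmented image, or the same text passed through a momentum encoder). All names here are hypothetical, and the TCL paper's full method (e.g., its local mutual-information objective) is omitted:

```python
import torch
import torch.nn.functional as F

def info_nce(query: torch.Tensor, key: torch.Tensor,
             temperature: float = 0.07) -> torch.Tensor:
    """InfoNCE where row i of `query` should match row i of `key`."""
    query = F.normalize(query, dim=-1)
    key = F.normalize(key, dim=-1)
    logits = query @ key.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, targets)

def tcl_style_loss(img, txt, img_aug, txt_aug, temperature=0.07):
    """Cross-modal alignment plus intra-modal contrast.

    img_aug / txt_aug: embeddings of a second view of the same inputs
    (hypothetical names, not the paper's API).
    """
    # Cross-modal: align matched image-text pairs (as in CLIP)
    cross = (info_nce(img, txt, temperature) +
             info_nce(txt, img, temperature)) / 2
    # Intra-modal: two views of the same input stay close in the space
    intra = (info_nce(img, img_aug, temperature) +
             info_nce(txt, txt_aug, temperature)) / 2
    return cross + intra
```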