Improving Zero-Shot Transfer Learning in Vision-Language Models via Multimodal Contrastive Alignment
Abstract
Zero-shot transfer learning aims to extend the capabilities of vision-language models to novel tasks without task-specific training. This paper proposes an approach that improves zero-shot transfer by strengthening the contrastive alignment between visual and textual modalities. The tighter cross-modal alignment improves the model's ability to generalize to unseen tasks, yielding better performance and robustness across a range of applications.
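The abstract does not spell out the training objective. As a rough sketch of what a multimodal contrastive alignment loss typically looks like, the snippet below implements a symmetric CLIP-style InfoNCE loss over a batch of paired image and text embeddings; the function name, batch layout, and temperature value are illustrative assumptions, not details from the paper.

```python
import numpy as np

def contrastive_alignment_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss aligning paired image/text embeddings.

    image_emb, text_emb: arrays of shape (batch, dim), where row i of each
    array comes from the same image-text pair.
    """
    # L2-normalize embeddings onto the unit hypersphere (cosine similarity)
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # Pairwise similarity logits; diagonal entries are the matched pairs
    logits = img @ txt.T / temperature
    n = logits.shape[0]

    def cross_entropy(l):
        # Numerically stable log-softmax over each row
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        # The correct "class" for row i is column i (its paired embedding)
        return -log_probs[np.arange(n), np.arange(n)].mean()

    # Average of image-to-text and text-to-image directions
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Minimizing this loss pulls matched image-text pairs together and pushes mismatched pairs apart, which is the alignment property that zero-shot transfer relies on at inference time (classification by nearest text embedding).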
Published
2024-08-13
Section
Articles
License
Copyright (c) 2024 MZ Computing Journal
This work is licensed under a Creative Commons Attribution 4.0 International License.