(1) Aderinokun, A. Unified Multimodal Transformers: Improving Vision-Language Models With Knowledge-Guided Attention Mechanisms. MZJAI 2024, 1.