Abstract: Language models (LMs) are essential in natural language processing and vision-language modeling. However, several challenges arise in the pre-training and fine-tuning of LMs. First, when learning through unsupervised pre-training, information that is semantically irrelevant may negatively affect downstream tasks, leading to negative transfer. Second, cross-modal masked language modeling is often used to learn vision-language associations…