Abstract: Language models (LMs) are essential in natural language processing and vision-language modeling. However, several challenges arise in the pre-training and fine-tuning of LMs. First, during unsupervised pre-training, semantically irrelevant information may negatively affect downstream tasks, leading to negative transfer. Second, cross-modal masked language modeling is often used to learn vision-language associations in vision-language models. However, existing masking strategies may be insufficient, in that the masked tokens can often be recovered from the language context alone, ignoring the visual input. Lastly, prompt tuning is effective for fine-tuning LMs on downstream tasks with limited labeled samples, but designing good prompts is difficult.
To tackle these issues, we propose several measures. First, we introduce a new pre-training method that trains each expert with only semantically relevant data through cluster-conditional gates. This allows each downstream task to be allocated to a customized model pre-trained on the data most similar to the downstream data. Second, for pre-training vision-language models, we use a masking strategy based on the saliency of each language token with respect to the image. Lastly, we use meta-learning to learn an efficient prompt pool that can extract diverse knowledge from historical tasks, so that instance-dependent prompts can be constructed from the pool without tuning the whole LM. Experimental results show that these measures significantly improve the performance of LMs.
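As a rough illustration of the first measure, the sketch below shows cluster-conditional gating in PyTorch. This is a minimal sketch under our own assumptions: the module name, the mixture-of-experts layout, and the offline cluster assignments are hypothetical, not the exact method presented in the talk.

    import torch
    import torch.nn as nn

    class ClusterConditionalGate(nn.Module):
        """Route each example to experts based on its cluster assignment.

        Cluster ids are assumed to come from an offline clustering
        (e.g., k-means over sentence embeddings) of the pre-training
        corpus, so each expert mostly sees semantically related data.
        """
        def __init__(self, num_clusters: int, num_experts: int, dim: int):
            super().__init__()
            # One learnable logit vector over experts per cluster.
            self.gate_logits = nn.Parameter(torch.zeros(num_clusters, num_experts))
            self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))

        def forward(self, x: torch.Tensor, cluster_id: torch.Tensor) -> torch.Tensor:
            # x: (batch, dim) features; cluster_id: (batch,) integer assignments.
            weights = torch.softmax(self.gate_logits[cluster_id], dim=-1)  # (batch, E)
            expert_out = torch.stack([e(x) for e in self.experts], dim=1)  # (batch, E, dim)
            return (weights.unsqueeze(-1) * expert_out).sum(dim=1)

    # Usage: two examples assigned to clusters 3 and 5.
    gate = ClusterConditionalGate(num_clusters=8, num_experts=4, dim=16)
    out = gate(torch.randn(2, 16), cluster_id=torch.tensor([3, 5]))

Under this reading, each expert is effectively trained only on its clusters during pre-training, and a downstream task can later be routed to the expert whose clusters best match its data.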
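The second measure can be sketched in a similar spirit as saliency-driven mask selection (again hypothetical: the gradient-based saliency score and all helper names are our assumptions). Tokens whose embeddings most influence the image-text similarity are the ones masked, so recovering them requires the visual input rather than language statistics alone.

    import torch

    def saliency_mask_positions(token_emb, image_feat, text_encoder, mask_ratio=0.15):
        """Pick masked-language-modeling positions by token saliency to the image.

        Saliency is taken as the gradient magnitude of the image-text
        similarity with respect to each token embedding.
        """
        token_emb = token_emb.detach().requires_grad_(True)
        text_feat = text_encoder(token_emb)                     # (batch, dim)
        torch.cosine_similarity(text_feat, image_feat).sum().backward()
        saliency = token_emb.grad.norm(dim=-1)                  # (batch, seq)
        k = max(1, int(mask_ratio * token_emb.size(1)))
        return saliency.topk(k, dim=-1).indices                 # positions to mask

    # Usage with a stand-in mean-pooling text encoder.
    encoder = lambda emb: emb.mean(dim=1)
    positions = saliency_mask_positions(torch.randn(2, 10, 16), torch.randn(2, 16), encoder)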
Bio: James Kwok is a Professor in the Department of Computer Science and Engineering, Hong Kong University of Science and Technology, and an IEEE Fellow. He has served or is serving as an Associate Editor of the IEEE Transactions on Neural Networks and Learning Systems, Neural Networks, Neurocomputing, the Artificial Intelligence Journal, and the International Journal of Data Science and Analytics, and on the Editorial Board of Machine Learning. He also serves as a Senior Area Chair of major machine learning / AI conferences, including NeurIPS, ICML, ICLR, and IJCAI, and is on the IJCAI Board of Trustees. He received the Most Influential Scholar Award Honorable Mention for “outstanding and vibrant contributions to the field of AAAI/IJCAI between 2009 and 2019”. Prof. Kwok will be the IJCAI-2025 Program Chair.