MedPrompt: Cross-Modal Prompting for Multi-Task Medical Image Translation

Chen Xuhang, Pun Chi-Man, Wang Shuqiang. arXiv 2023

[Paper]
Merging, Model Architecture, Multimodal Models, Pretraining Methods, Prompting, Reinforcement Learning, Tools, Transformer

Cross-modal medical image translation is an essential task for synthesizing missing-modality data for clinical diagnosis. However, current learning-based techniques struggle to capture cross-modal and global features, restricting them to specific pairs of modalities. This lack of versatility undermines their practical usefulness, particularly since the missing modality may vary from case to case. In this study, we present MedPrompt, a multi-task framework that efficiently translates between different modalities. Specifically, we propose the Self-adaptive Prompt Block, which dynamically guides the translation network towards distinct modalities. Within this framework, we introduce the Prompt Extraction Block and the Prompt Fusion Block to efficiently encode the cross-modal prompt. To enhance the extraction of global features across diverse modalities, we incorporate the Transformer model. Extensive experiments on five datasets and four pairs of modalities demonstrate that our proposed model achieves state-of-the-art visual quality and exhibits excellent generalization capability.
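
The abstract names the Self-adaptive Prompt Block and its Prompt Extraction and Prompt Fusion sub-blocks without implementation details. The sketch below is a minimal, hypothetical PyTorch rendering of that idea, not the paper's actual design: `PromptExtractionBlock` keeps a learnable bank of prompts and mixes them with input-dependent softmax weights, and `PromptFusionBlock` injects the resulting prompt back into the feature stream. All class names, tensor shapes, and the weighting scheme are assumptions.

```python
# Hypothetical sketch of prompt extraction and fusion blocks for a
# multi-task image translation network. Names and design choices are
# illustrative assumptions, not taken from the MedPrompt paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PromptExtractionBlock(nn.Module):
    """Builds an input-adaptive prompt from a learnable prompt bank
    (assumed design: softmax-weighted mixture of prompt maps)."""

    def __init__(self, num_prompts: int, channels: int, spatial: int):
        super().__init__()
        # Bank of learnable prompt maps, one per candidate task/modality.
        self.prompts = nn.Parameter(
            torch.randn(num_prompts, channels, spatial, spatial))
        # Predicts mixing weights over the bank from a pooled descriptor.
        self.weight_head = nn.Linear(channels, num_prompts)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W). Pool to a per-sample descriptor.
        desc = feat.mean(dim=(2, 3))                       # (B, C)
        w = torch.softmax(self.weight_head(desc), dim=1)   # (B, N)
        # Weighted sum over the bank -> input-adaptive prompt.
        prompt = torch.einsum('bn,nchw->bchw', w, self.prompts)
        # Match the feature map's spatial size if it differs.
        return F.interpolate(prompt, size=feat.shape[-2:],
                             mode='bilinear', align_corners=False)


class PromptFusionBlock(nn.Module):
    """Fuses the extracted prompt with the network features
    (assumed design: channel concatenation + 3x3 conv)."""

    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels,
                              kernel_size=3, padding=1)

    def forward(self, feat: torch.Tensor,
                prompt: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([feat, prompt], dim=1))


# Usage: condition intermediate features on the self-adaptive prompt.
feat = torch.randn(2, 64, 32, 32)
extract = PromptExtractionBlock(num_prompts=4, channels=64, spatial=16)
fuse = PromptFusionBlock(channels=64)
out = fuse(feat, extract(feat))
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

In this reading, the input-dependent weights over the prompt bank are what would make the block "self-adaptive": one network can be steered toward different target modalities at inference time, rather than training a separate model per modality pair.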

Similar Work