Aquila-Med LLM: Pioneering Full-process Open-source Medical Language Models

Zhao Lulu, Zeng Weihao, Shi Xiaofeng, Zhou Hua, Hao Donglin, Lin Yonghua. arXiv 2024

Tags: Agentic, Efficiency And Optimization, Fine Tuning, Pretraining Methods, Reinforcement Learning, Training Techniques

Recently, both closed-source LLMs and open-source communities have made significant strides, outperforming humans in various general domains. However, their performance in specific professional fields such as medicine, especially within the open-source community, remains suboptimal due to the complexity of medical knowledge. We propose Aquila-Med, a bilingual medical LLM based on Aquila, which addresses these challenges through continued pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF). We construct a large-scale Chinese and English medical dataset for continued pre-training and a high-quality SFT dataset covering extensive medical specialties. Additionally, we develop a high-quality Direct Preference Optimization (DPO) dataset for further alignment. Aquila-Med achieves notable results on single-turn dialogues, multi-turn dialogues, and medical multiple-choice questions, demonstrating the effectiveness of our approach. We open-source the datasets and the entire training process, contributing valuable resources to the research community. Our models and datasets will be released at https://huggingface.co/BAAI/AquilaMed-RL.
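
The abstract names Direct Preference Optimization as the alignment step but does not spell out the objective. As a rough illustration only, the standard DPO loss for a single preference pair can be sketched as below. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for this sketch; they are not taken from the paper or from the released training code.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta is a hypothetical default, not a value reported by the authors.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Policy identical to the reference: loss equals log(2).
baseline = dpo_loss(-2.0, -2.0, -2.0, -2.0)

# Policy raises the chosen response and lowers the rejected one: loss drops.
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
print(baseline, improved)
```

In practice the log-probabilities would come from a batched forward pass over a preference dataset, and the loss would be averaged across pairs; this scalar version only shows the shape of the objective.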
