Baichuan 2: Open Large-scale Language Models

Yang Aiyuan, Xiao Bin, Wang Bingning, Zhang Borong, Bian Ce, Yin Chao, Lv Chenxu, Pan Da, Wang Dian, Yan Dong, Yang Fan, Deng Fei, Wang Feng, Liu Feng, Ai Guangwei, Dong Guosheng, Zhao Haizhou, Xu Hang, Sun Haoze, Zhang Hongda, Liu Hui, Ji Jiaming, Xie Jian, Dai Juntao, Fang Kun, Su Lei, Song Liang, Liu Lifeng, Ru Liyun, Ma Luyao, Wang Mang, Liu Mickel, Lin Mingan, Nie Nuolan, Guo Peidong, Sun Ruiyang, Zhang Tao, Li Tianpeng, Li Tianyu, Cheng Wei, Chen Weipeng, Zeng Xiangrong, Wang Xiaochuan, Chen Xiaoxi, Men Xin, Yu Xin, Pan Xuehai, Shen Yanjun, Wang Yiding, Li Yiyu, Jiang Youxin, Gao Yuchen, Zhang Yupeng, Zhou Zenan, Wu Zhiying. Arxiv 2023

[Paper]    
Training Techniques

Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks such as MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2.
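Since the report describes openly released 7B and 13B checkpoints, a minimal sketch of how one might load and query such a checkpoint with Hugging Face transformers is shown below. The repository id, the trust_remote_code flag, and the generation settings are illustrative assumptions, not details taken from the abstract.

```python
# Minimal sketch: loading an assumed Baichuan 2 release from the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baichuan-inc/Baichuan2-7B-Chat"  # assumed repository id, for illustration only

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # place the weights on available GPUs (requires accelerate)
    torch_dtype="auto",      # keep the checkpoint's native precision
    trust_remote_code=True,  # assumption: the release ships custom modeling code
)

# Simple greedy generation from a short prompt.
inputs = tokenizer("What model sizes does Baichuan 2 provide?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```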

Similar Work