
Symbol-LLM: Towards Foundational Symbol-centric Interface for Large Language Models

Xu Fangzhi, Wu Zhiyong, Sun Qiushi, Ren Siyu, Yuan Fei, Yuan Shuai, Lin Qika, Qiao Yu, Liu Jun. arXiv 2023

Tags: Reinforcement Learning, Tools, Training Techniques, Uncategorized

Although Large Language Models (LLMs) demonstrate remarkable ability in processing and generating human-like text, they have limitations when it comes to comprehending and expressing world knowledge that extends beyond the boundaries of natural language (e.g., chemical molecular formulas). Injecting a collection of symbolic data directly into the training of LLMs can be problematic, as it disregards the synergies among different symbolic families and overlooks the need for a balanced mixture of natural and symbolic data. In this work, we tackle these challenges from both a data and a framework perspective and introduce the Symbol-LLM series of models. First, we curate a data collection consisting of 34 tasks and incorporating approximately 20 distinct symbolic families, with the intent of capturing the interrelations and fostering synergies between symbols. Then, a two-stage tuning framework injects symbolic knowledge without losing general ability. Extensive experiments on both symbol- and NL-centric tasks demonstrate the balanced and superior performance of the Symbol-LLM series models. The project page is https://xufangzhi.github.io/symbol-llm-page/.
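As a rough illustration of the two-stage tuning idea (a symbol-injection stage followed by tuning on a balanced symbolic/NL mixture that preserves general ability), here is a minimal PyTorch sketch. The toy model, synthetic datasets, and 1:1 mixing ratio are illustrative assumptions, not the paper's actual recipe or scale.

```python
import random

import torch
from torch import nn
from torch.utils.data import ConcatDataset, DataLoader, Dataset


class ToyTokenDataset(Dataset):
    """Stand-in for one tokenized data family (symbolic or natural language)."""

    def __init__(self, n_examples, vocab=100, seq_len=16, seed=0):
        rng = random.Random(seed)
        self.rows = [
            torch.tensor([rng.randrange(vocab) for _ in range(seq_len)])
            for _ in range(n_examples)
        ]

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        return row[:-1], row[1:]  # next-token prediction pairs


class TinyLM(nn.Module):
    """A toy autoregressive LM standing in for the base LLM."""

    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, x):
        hidden, _ = self.rnn(self.embed(x))
        return self.head(hidden)


def tune_stage(model, dataset, epochs=1, lr=1e-3):
    """One tuning stage: plain next-token cross-entropy over `dataset`."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    loader = DataLoader(dataset, batch_size=8, shuffle=True)
    for _ in range(epochs):
        for inputs, targets in loader:
            logits = model(inputs)
            loss = loss_fn(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model


# Placeholder corpora; in the paper these would be the 34 symbolic tasks
# (~20 symbolic families) plus general natural-language instruction data.
symbolic_data = ToyTokenDataset(200, seed=1)
natural_data = ToyTokenDataset(200, seed=2)

model = TinyLM()
# Stage 1: tune on symbolic data alone to inject symbolic knowledge.
tune_stage(model, symbolic_data)
# Stage 2: tune on a balanced symbolic + NL mixture so general
# natural-language ability is not lost.
tune_stage(model, ConcatDataset([symbolic_data, natural_data]))
```

The design point the sketch captures is that stage 2 revisits natural-language data, so the symbolic knowledge injected in stage 1 does not come at the cost of general NL ability.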
