Large Language Models Are Learnable Planners For Long-term Recommendation

Shi Wentao, He Xiangnan, Zhang Yang, Gao Chongming, Li Xinyue, Zhang Jizhi, Wang Qifan, Feng Fuli. arXiv 2024

Tags: Agentic, Has Code, RAG, Reinforcement Learning, Tools, Training Techniques

Planning for both immediate and long-term benefits is increasingly important in recommendation. Existing methods apply Reinforcement Learning (RL) to learn planning capability by maximizing cumulative reward for long-term recommendation. However, the scarcity of recommendation data makes training RL models from scratch unstable and prone to overfitting, resulting in sub-optimal performance. In this light, we propose to leverage the remarkable planning capability of Large Language Models (LLMs) over sparse data for long-term recommendation. The key to achieving this lies in formulating a guidance plan that follows principles of enhancing long-term engagement, and in grounding the plan into effective, executable actions in a personalized manner. To this end, we propose the Bi-level Learnable LLM Planner (BiLLP) framework, which consists of a set of LLM instances and breaks the learning process down into macro-learning and micro-learning, which learn macro-level guidance and micro-level personalized recommendation policies, respectively. Extensive experiments validate that the framework facilitates the planning ability of LLMs for long-term recommendation. Our code and data can be found at https://github.com/jizhi-zhang/BiLLP.
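The abstract describes the framework only at a high level. The minimal Python sketch below illustrates one way such a bi-level loop could be organized: a macro-level planner LLM produces high-level guidance, a micro-level actor LLM grounds it into a concrete recommendation, and each level maintains a text memory updated by macro- and micro-learning. All class names, prompts, and memory structures here are illustrative assumptions, not the authors' implementation; see the linked repository for the actual BiLLP code.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical sketch of a bi-level LLM planner as described in the abstract.
# `LLM` is any text-in/text-out model endpoint injected by the caller.
LLM = Callable[[str], str]

@dataclass
class BiLevelPlanner:
    llm: LLM
    macro_memory: List[str] = field(default_factory=list)  # guidance lessons (macro-learning)
    micro_memory: List[str] = field(default_factory=list)  # personalized experiences (micro-learning)

    def plan(self, user_state: str) -> str:
        """Macro level: draft high-level guidance targeting long-term engagement."""
        prompt = (f"Guidance lessons so far: {self.macro_memory}\n"
                  f"User state: {user_state}\n"
                  "Give one high-level plan to maximize long-term engagement:")
        return self.llm(prompt)

    def act(self, user_state: str, guidance: str) -> str:
        """Micro level: ground the guidance into one executable recommendation."""
        prompt = (f"Past experiences: {self.micro_memory}\n"
                  f"Guidance: {guidance}\n"
                  f"User state: {user_state}\n"
                  "Recommend one concrete item:")
        return self.llm(prompt)

    def micro_learn(self, user_state: str, action: str, reward: float) -> None:
        # Store a step-level (state, action, reward) experience for the actor.
        self.micro_memory.append(f"state={user_state} action={action} reward={reward}")

    def macro_learn(self, guidance: str, episode_return: float) -> None:
        # Distill an episode-level reflection into a reusable guidance lesson.
        lesson = self.llm(
            f"Guidance '{guidance}' achieved cumulative reward {episode_return}. "
            "State one lesson for future planning:")
        self.macro_memory.append(lesson)
```

Under these assumptions, one episode would loop plan → act → observe reward → `micro_learn` at each step, then call `macro_learn` on the cumulative return, so that step-level rewards shape the personalized policy while episode-level returns shape the long-term guidance.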

Similar Work