Self-supervised Bot Play For Conversational Recommendation With Justifications

Li Shuyang, Majumder Bodhisattwa Prasad, McAuley Julian. arXiv 2021

Tags: RAG, Reinforcement Learning, Tools, Training Techniques

Conversational recommender systems offer the promise of interactive, engaging ways for users to find items they enjoy. We seek to improve conversational recommendation via three dimensions: 1) We aim to mimic a common mode of human interaction for recommendation: experts justify their suggestions, a seeker explains why they don’t like the item, and both parties iterate through the dialog to find a suitable item. 2) We leverage ideas from conversational critiquing to allow users to flexibly interact with natural language justifications by critiquing subjective aspects. 3) We adapt conversational recommendation to a wider range of domains where crowd-sourced ground truth dialogs are not available. We develop a new two-part framework for training conversational recommender systems. First, we train a recommender system to jointly suggest items and justify its reasoning with subjective aspects. We then fine-tune this model to incorporate iterative user feedback via self-supervised bot-play. Experiments on three real-world datasets demonstrate that our system can be applied to different recommendation models across diverse domains to achieve superior performance in conversational recommendation compared to state-of-the-art methods. We also evaluate our model on human users, showing that systems trained under our framework provide more useful, helpful, and knowledgeable recommendations in warm- and cold-start settings.
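
The critiquing loop described in the abstract can be illustrated with a toy sketch. This is not the authors' model or training code: the catalog, the aspect names, the additive scoring rule, and the functions `recommend`, `seeker_critique`, and `bot_play` are all hypothetical and chosen only to show the interaction pattern. The sketch assumes the seeker bot is driven by a known target item, which is what would let such dialogs be generated without crowd-sourced conversations. It shows only the dialog loop, not the fine-tuning step.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy catalog: each item is described by subjective aspects.
CATALOG: Dict[str, Set[str]] = {
    "cafe_a": {"cozy", "quiet", "cheap"},
    "cafe_b": {"cozy", "loud", "cheap"},
    "cafe_c": {"trendy", "loud", "pricey"},
    "cafe_d": {"trendy", "quiet", "pricey"},
}


def recommend(weights: Dict[str, float], already_suggested: List[str]) -> str:
    """Pick the unsuggested item with the highest summed aspect weight.

    Ties are broken alphabetically so the example is deterministic.
    """
    candidates = [item for item in CATALOG if item not in already_suggested]
    return min(
        candidates,
        key=lambda item: (-sum(weights.get(a, 0.0) for a in CATALOG[item]), item),
    )


def seeker_critique(target: str, justification: List[str]) -> Optional[str]:
    """Seeker bot: reject the first justified aspect the target item lacks."""
    for aspect in justification:
        if aspect not in CATALOG[target]:
            return aspect
    return None


def bot_play(target: str, max_turns: int = 5) -> List[str]:
    """Run one simulated dialog and return the items suggested, in order."""
    weights: Dict[str, float] = {}  # learned aspect preferences, neutral at start
    suggested: List[str] = []
    for _ in range(min(max_turns, len(CATALOG))):
        item = recommend(weights, suggested)
        suggested.append(item)
        if item == target:
            break  # the seeker accepts the recommendation
        # The recommender justifies its suggestion with the item's aspects.
        justification = sorted(CATALOG[item])
        critique = seeker_critique(target, justification)
        if critique is not None:
            # Penalize the critiqued aspect before the next turn.
            weights[critique] = weights.get(critique, 0.0) - 1.0
    return suggested


if __name__ == "__main__":
    print(bot_play("cafe_d"))
```

In a learned system, the hand-written weight update would be replaced by a model update driven by the critique; the fixed penalty here is only a stand-in for that step.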
