FACTS About Building Retrieval Augmented Generation-based Chatbots

Akkiraju Rama, Xu Anbang, Bora Deepak, Yu Tan, An Lu, Seth Vishal, Shukla Aaditya, Gundecha Pritam, Mehta Hridhay, Jha Ashwin, Raj Prithvi, Balasubramanian Abhinav, Maram Murali, Muthusamy Guru, Annepally Shivakesh Reddy, Knowles Sidney, Du Min, Burnett Nick, Javiya Sean, Marannan Ashok, Kumari Mamta, Jha Surbhi, Dereszenski Ethan, Chakraborty Anupam, Ranjan Subhash, Terfai Amina, Surya Anoop, Mercer Tracey, Thanigachalam Vinodh Kumar, Bar Tamar, Krishnan Sanjana, Kilaru Samy, Jaksic Jasmine, Algarici Nave, Liberman Jacob, Conway Joey, Nayyar Sonu, Boitano Justin. arXiv 2024

Agentic, Applications, Fine Tuning, Merging, Model Architecture, Pretraining Methods, Prompting, RAG, Security, Tools, Training Techniques

Enterprise chatbots, powered by generative AI, are emerging as key applications to enhance employee productivity. Retrieval Augmented Generation (RAG), Large Language Models (LLMs), and orchestration frameworks like LangChain and LlamaIndex are crucial for building these chatbots. However, creating effective enterprise chatbots is challenging and requires meticulous RAG pipeline engineering. This includes fine-tuning embeddings and LLMs, extracting documents from vector databases, rephrasing queries, reranking results, designing prompts, honoring document access controls, providing concise responses, including references, safeguarding personal information, and building orchestration agents. We present a framework for building RAG-based chatbots based on our experience with three NVIDIA chatbots: for IT/HR benefits, financial earnings, and general content. Our contributions are three-fold: introducing the FACTS framework (Freshness, Architectures, Cost, Testing, Security), presenting fifteen RAG pipeline control points, and providing empirical results on accuracy-latency tradeoffs between large and small LLMs. To the best of our knowledge, this is the first paper of its kind that provides a holistic view of the factors as well as solutions for building secure enterprise-grade chatbots.
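
As a rough illustration of a few of the pipeline stages the abstract lists (retrieval, honoring document access controls, and prompt design with references), here is a minimal Python sketch. It is not the paper's implementation: the `Doc` class, the `allowed_groups` field, the bag-of-words `embed` function, and the sample documents are all hypothetical stand-ins for a real embedding model, vector database, and permission system.

```python
from collections import Counter
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    allowed_groups: frozenset  # hypothetical access-control list


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would use a trained model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, user_groups, k=2):
    """Apply access control first, then rank the visible documents by similarity."""
    visible = [d for d in docs if d.allowed_groups & user_groups]
    q = embed(query)
    scored = [(cosine(q, embed(d.text)), d) for d in visible]
    scored = [(s, d) for s, d in scored if s > 0]  # drop documents with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query, hits):
    """Assemble a prompt that asks for a concise answer with document references."""
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in hits)
    return (
        "Answer concisely using only the context, and cite document ids.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical sample corpus, for illustration only.
DOCS = [
    Doc("hr-1", "employees accrue vacation days every month", frozenset({"all"})),
    Doc("fin-1", "quarterly earnings guidance is restricted before release", frozenset({"finance"})),
    Doc("it-1", "reset your vpn password through the it portal", frozenset({"all"})),
]

if __name__ == "__main__":
    hits = retrieve("how do I reset my vpn password", DOCS, {"all"})
    print(build_prompt("how do I reset my vpn password", hits))
```

Filtering by group membership before scoring means a restricted document can never reach the prompt, whatever its similarity score. Query rephrasing, reranking, and the LLM call itself are omitted from this sketch.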

Similar Work