SLM Meets LLM: Balancing Latency, Interpretability And Consistency In Hallucination Detection

Hu Mengya, Xu Rui, Lei Deren, Li Yaxi, Wang Mingyu, Ching Emily, Kamal Eslam, Deng Alex. arXiv 2024

Tags: Applications, Interpretability And Explainability, Prompting, RAG, Tools

Large language models (LLMs) are highly capable but face latency challenges in real-time applications such as online hallucination detection. To overcome this issue, we propose a novel framework that uses a small language model (SLM) classifier for initial detection, followed by an LLM acting as a constrained reasoner that generates detailed explanations for the detected hallucinated content. This study optimizes real-time interpretable hallucination detection by introducing effective prompting techniques that align LLM-generated explanations with SLM decisions. Experimental results demonstrate the effectiveness of the approach, thereby enhancing the overall user experience.
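
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `detect_with_explanation`, `toy_slm_score`, and `Detection`, the 0.5 threshold, and the prompt wording are all hypothetical, and the word-overlap scorer is only a stand-in for a trained SLM classifier. The sketch shows one idea only: the fast classifier runs on every response, while the slower LLM is called just for flagged responses and is told the classifier's decision so that its explanation stays consistent with it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Detection:
    flagged: bool                      # decision made by the small model
    score: float                       # small-model hallucination score
    explanation: Optional[str] = None  # filled in only for flagged responses


def toy_slm_score(source: str, response: str) -> float:
    """Stand-in for an SLM classifier (assumption): the fraction of
    response words that never appear in the source text."""
    source_words = set(source.lower().split())
    words = response.lower().split()
    if not words:
        return 0.0
    unsupported = sum(1 for w in words if w not in source_words)
    return unsupported / len(words)


def detect_with_explanation(
    source: str,
    response: str,
    slm_score: Callable[[str, str], float],
    llm_explain: Callable[[str], str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> Detection:
    # Stage 1: the fast classifier scores every response.
    score = slm_score(source, response)
    if score < threshold:
        # Not flagged: skip the LLM entirely, so no extra latency is paid.
        return Detection(flagged=False, score=score)

    # Stage 2: the LLM acts as a constrained reasoner. The prompt states the
    # classifier's decision and asks only for an explanation of it.
    # The wording below is illustrative, not the paper's prompt.
    prompt = (
        "A classifier flagged the following response as hallucinated.\n"
        f"Source: {source}\n"
        f"Response: {response}\n"
        "Explain which parts of the response are not supported by the source."
    )
    return Detection(flagged=True, score=score, explanation=llm_explain(prompt))
```

Here `llm_explain` is any callable that takes a prompt string and returns text, so a real LLM client or a stub can be passed in. Keeping the LLM behind the threshold check is what bounds latency in this sketch; how the paper handles cases where the LLM disagrees with the classifier is not covered here.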
