Hallucination Detection For Grounded Instruction Generation

Zhao Lingjun, Nguyen Khanh, Daumé Hal III. arXiv 2023

Fine Tuning, Model Architecture, Multimodal Models, Pretraining Methods, Training Techniques, Transformer

We investigate the problem of generating instructions to guide humans to navigate in simulated residential environments. A major issue with current models is hallucination: they generate references to actions or objects that are inconsistent with what a human follower would perform or encounter along the described path. We develop a model that detects these hallucinated references by adopting a model pre-trained on a large corpus of image-text pairs, and fine-tuning it with a contrastive loss that separates correct instructions from instructions containing synthesized hallucinations. Our final model outperforms several baselines, including using word probability estimated by the instruction-generation model, and supervised models based on LSTM and Transformer.
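The contrastive training idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the authors' implementation: the `SWAPS` table, the function names `synthesize_hallucinations` and `contrastive_loss`, and the softmax-style form of the loss are all hypothetical, and the scores stand in for whatever compatibility score a fine-tuned image-text model would assign to an instruction and its path.

```python
import math

# Hypothetical swap table used to synthesize hallucinated instructions
# (an assumption for illustration; the paper's procedure may differ).
SWAPS = {
    "left": "right",
    "right": "left",
    "kitchen": "bedroom",
    "bedroom": "kitchen",
}


def synthesize_hallucinations(instruction):
    """Return copies of the instruction, each with one word swapped for a wrong one."""
    words = instruction.split()
    negatives = []
    for i, word in enumerate(words):
        if word in SWAPS:
            corrupted = words[:i] + [SWAPS[word]] + words[i + 1:]
            negatives.append(" ".join(corrupted))
    return negatives


def contrastive_loss(pos_score, neg_scores):
    """Softmax-style contrastive loss: -log p(correct | correct + negatives).

    The loss is small when the correct instruction scores higher than
    every hallucinated variant, and large otherwise.
    """
    scores = [pos_score] + list(neg_scores)
    max_score = max(scores)  # subtract the max for numerical stability
    log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_norm - pos_score


if __name__ == "__main__":
    negatives = synthesize_hallucinations("turn left at the kitchen")
    print(negatives)
    # Placeholder scores; a real system would obtain them from the model.
    print(contrastive_loss(2.0, [0.0, -1.0]))
```

In this sketch, minimizing the loss pushes the score of the correct instruction above the scores of its corrupted variants, which is one common way to realize the separation the abstract describes.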


