CAUS: A Dataset For Question Generation Based On Human Cognition Leveraging Large Language Models

Shin Minjung, Kim Donghyun, Ryu Jeh-kwang. arXiv 2024

Tags: GPT, Model Architecture, RAG, Reinforcement Learning

We introduce the Curious About Uncertain Scene (CAUS) dataset, designed to enable Large Language Models (LLMs), specifically GPT-4, to emulate human cognitive processes for resolving uncertainties. Leveraging this dataset, we investigate the potential of LLMs to ask questions effectively. Our approach provides scene descriptions embedded with uncertainties to stimulate the generation of reasoning and queries. The queries are then classified according to multi-dimensional criteria. All procedures are facilitated by a collaborative system involving both LLMs and human researchers. Our results demonstrate that GPT-4 can generate pertinent questions and grasp their nuances, particularly when given appropriate context and instructions. The study suggests that incorporating human-like questioning into Artificial Intelligence (AI) models improves their ability to manage uncertainties, paving the way for future advancements in AI.
