MacGyver: Are Large Language Models Creative Problem Solvers?

Tian Yufei, Ravichander Abhilasha, Qin Lianhui, Le Bras Ronan, Marjieh Raja, Peng Nanyun, Choi Yejin, Griffiths Thomas L., Brahman Faeze. arXiv 2023

Agentic Prompting Reinforcement Learning Tools

We explore the creative problem-solving capabilities of modern LLMs in a novel constrained setting. To this end, we create MACGYVER, an automatically generated dataset consisting of over 1,600 real-world problems deliberately designed to trigger innovative usage of objects and necessitate out-of-the-box thinking. We then present our collection to both LLMs and humans to compare and contrast their problem-solving abilities. MACGYVER is challenging for both groups, but in unique and complementary ways. For instance, humans excel in tasks they are familiar with but struggle with domain-specific knowledge, leading to a higher variance. In contrast, LLMs, exposed to a variety of specialized knowledge, attempt a broader range of problems but fail by proposing physically infeasible actions. Finally, we provide a detailed error analysis of LLMs, and demonstrate the potential of enhancing their problem-solving ability with novel prompting techniques such as iterative step-wise reflection and divergent-convergent thinking. This work (1) introduces a fresh arena for intelligent agents focusing on intricate aspects of physical reasoning, planning, and unconventional thinking, which supplements the existing spectrum of machine intelligence; and (2) provides insight into the constrained problem-solving capabilities of both humans and AI.
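The abstract names iterative step-wise reflection without spelling out the procedure. The sketch below is one plausible reading, not the paper's prompts or implementation: draft a step list, ask the model to flag the first physically infeasible step, substitute its revision, and repeat. The `llm` callable, the prompt wording, the `OK` / `<number>: <revision>` reply format, and `max_rounds` are all assumptions made for illustration.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt string to a text reply.
LLM = Callable[[str], str]


def stepwise_reflection(problem: str, llm: LLM, max_rounds: int = 3) -> List[str]:
    """Draft a solution, then repeatedly ask the model to repair one infeasible step."""
    draft = llm(f"Problem: {problem}\nList solution steps, one per line.")
    steps = [line.strip() for line in draft.splitlines() if line.strip()]

    for _ in range(max_rounds):
        numbered = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(steps))
        verdict = llm(
            f"Problem: {problem}\nSteps:\n{numbered}\n"
            "Reply 'OK' if every step is physically feasible; otherwise reply "
            "'<step number>: <revised step>'."
        ).strip()

        if verdict.upper() == "OK":
            break  # the model accepts every step

        # Assumed reply format: "<step number>: <revised step>"
        index_text, _, revised = verdict.partition(":")
        if not index_text.strip().isdigit() or not revised.strip():
            break  # unparseable critique; keep the current steps
        idx = int(index_text) - 1
        if not 0 <= idx < len(steps):
            break  # critique points at a step that does not exist
        steps[idx] = revised.strip()

    return steps
```

To try it, pass any prompt-to-text function as `llm`, for example a thin wrapper around a chat client. Keeping the model behind a plain callable makes the loop easy to exercise with a scripted stub before wiring in a real model.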
