Generative AI As A Metacognitive Agent: A Comparative Mixed-Method Study With Human Participants On ICF-Mimicking Exam Performance

Pavlovic Jelena (University of Belgrade, Faculty of Philosophy, and Koucing Centar Research Lab), Krstic Jugoslav (Koucing Centar Research Lab), Mitrovic Luka (Koucing Centar Research Lab), Babic Djordje (Koucing Centar Research Lab), Milosavljevic Adrijana (Koucing Centar Research Lab), Nikolic Milena (Koucing Centar Research Lab), Karaklic Tijana (Koucing Centar Research Lab), Mitrovic Tijana (Koucing Centar Research Lab). arXiv 2024

Agentic, Ethics And Bias, GPT, Model Architecture, Reinforcement Learning, Tools

This study investigates the metacognitive capabilities of Large Language Models (LLMs) relative to human metacognition in the context of an exam that mimics the International Coaching Federation (ICF) exam, a situational judgment test of coaching competencies. Using a mixed-method approach, we assessed the metacognitive performance of human participants and five advanced LLMs (GPT-4, Claude 3 Opus, Mistral Large, Llama 3, and Gemini 1.5 Pro), including sensitivity, accuracy of probabilistic predictions, and bias. The results indicate that LLMs outperformed humans on all metacognitive metrics, most notably by showing less overconfidence. However, both LLMs and humans showed less adaptability in ambiguous scenarios, adhering closely to predefined decision frameworks. The study suggests that Generative AI can effectively engage in human-like metacognitive processing without conscious awareness. Implications of the study are discussed in relation to the development of AI simulators that scaffold the cognitive and metacognitive aspects of mastering coaching competencies. More broadly, the results are discussed in relation to the development of metacognitive modules that lead towards more autonomous and intuitive AI systems.
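The abstract names three kinds of metacognitive measure (sensitivity, accuracy of probabilistic predictions, and bias) without giving formulas. As an illustration only, the sketch below uses common textbook stand-ins, which are assumptions and not necessarily the paper's definitions: the Brier score for probabilistic accuracy, mean confidence minus accuracy for bias (positive values indicate overconfidence), and the gap between mean confidence on correct and incorrect answers for sensitivity. The function name and the sample data are hypothetical.

```python
from statistics import mean


def metacognitive_metrics(confidences, correct):
    """Summarize confidence judgments against exam outcomes.

    confidences: stated probability of being correct per item, in [0, 1]
    correct:     1 if the item was answered correctly, else 0
    NOTE: these formulas are illustrative assumptions, not the paper's.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    accuracy = mean(correct)
    # Brier score: mean squared gap between confidence and outcome (lower is better).
    brier = mean([(c - o) ** 2 for c, o in zip(confidences, correct)])
    # Bias: positive means overconfident, negative means underconfident.
    bias = mean(confidences) - accuracy
    # Sensitivity: how much higher confidence is on correct than on incorrect items.
    right = [c for c, o in zip(confidences, correct) if o == 1]
    wrong = [c for c, o in zip(confidences, correct) if o == 0]
    sensitivity = mean(right) - mean(wrong) if right and wrong else None

    return {
        "accuracy": accuracy,
        "brier": brier,
        "bias": bias,
        "sensitivity": sensitivity,
    }


# Hypothetical data: four items, two answered correctly.
example = metacognitive_metrics([0.9, 0.8, 0.6, 0.7], [1, 1, 0, 0])
```

Under these assumed definitions, a respondent who reports high confidence on items they get wrong would show a positive bias and a higher Brier score, which is the pattern the abstract describes as overconfidence.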
