A Data Generation Perspective To The Mechanism Of In-context Learning

Mao Haitao, Liu Guangliang, Ma Yao, Wang Rongrong, Johnson Kristen, Tang Jiliang. Arxiv 2024

In Context Learning, Prompting, RAG, Reinforcement Learning, Survey Paper

In-Context Learning (ICL) empowers Large Language Models (LLMs) to learn in context, achieving downstream generalization from only a few in-context examples and without gradient updates. Despite this encouraging empirical success, the underlying mechanism of ICL remains unclear, and existing research offers various viewpoints for understanding it. These studies propose intuition-driven, ad hoc technical solutions for interpreting ICL, leaving an ambiguous road map. In this paper, we leverage a data generation perspective to reinterpret recent efforts and demonstrate the potential broader usage of popular technical solutions, moving toward a systematic view. For a conceptual definition, we rigorously adopt the terms skill learning and skill recognition. The difference between them is that skill learning can learn new data generation functions from in-context data. We also provide a comprehensive study of the merits and weaknesses of different solutions and highlight the uniformity among them under the data generation perspective, establishing a technical foundation for future research to incorporate the strengths of different lines of research.
