Lifelongmemory: Leveraging Llms For Answering Queries In Long-form Egocentric Videos

Wang Ying, Yang Yanlai, Ren Mengye. arXiv 2023

[Paper] [Code]

Tags: Agentic, Has Code, Interpretability And Explainability, RAG, Tools

In this paper we introduce LifelongMemory, a new framework for accessing long-form egocentric videographic memory through natural language question answering and retrieval. LifelongMemory generates concise video activity descriptions of the camera wearer and leverages the zero-shot capabilities of pretrained large language models to perform reasoning over long-form video context. Furthermore, LifelongMemory uses a confidence and explanation module to produce confident, high-quality, and interpretable answers. Our approach achieves state-of-the-art performance on the EgoSchema benchmark for question answering and is highly competitive on the natural language query (NLQ) challenge of Ego4D. Code is available at https://github.com/Agentic-Learning-AI-Lab/lifelong-memory.
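The abstract's pipeline (condense clip captions into a textual memory, reason over it with an LLM, and attach a confidence score and explanation to the answer) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the caption data is invented, and `query_llm` is a placeholder standing in for a real call to a pretrained model.

```python
from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    confidence: float  # produced by the confidence/explanation module (0..1)
    explanation: str

def condense_captions(captions):
    """Merge per-clip captions into a compact textual memory of the video."""
    return "\n".join(f"[{t}s] {c}" for t, c in captions)

def query_llm(prompt: str) -> Answer:
    # Placeholder: a real system would send `prompt` to a pretrained LLM
    # and parse its structured (answer, confidence, explanation) response.
    return Answer(text="making coffee", confidence=0.9,
                  explanation="The captions mention a kettle and a mug.")

def answer_query(captions, question: str) -> Answer:
    memory = condense_captions(captions)
    prompt = (f"Video activity log:\n{memory}\n\n"
              f"Question: {question}\n"
              "Answer, and give a confidence score with an explanation.")
    return query_llm(prompt)

# Hypothetical per-clip captions of the camera wearer's activity
captions = [(0, "The camera wearer fills a kettle."),
            (30, "The camera wearer pours hot water into a mug.")]
result = answer_query(captions, "What is the camera wearer doing?")
```

The key design point the paper highlights is that the LLM never sees raw video: it reasons zero-shot over the condensed captions, and the confidence/explanation output makes the final answer interpretable.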

Similar Work