[Paper]
In this paper, we introduce a novel Artificial Intelligence (AI) system
inspired by the philosophical and psychoanalytical concept of imagination as a
"Re-construction of Experiences". Our AI system is equipped with an
imagination-inspired module that bridges the gap between textual inputs and
other modalities, enriching the derived information based on previously learned
experiences. A unique feature of our system is its ability to formulate
independent perceptions of inputs. This leads to unique interpretations of a
concept that may differ from human interpretations but are equally valid, a
phenomenon we term "Interpretable Misunderstanding". We employ large-scale
models, specifically a Multimodal Large Language Model (MLLM), enabling our
proposed system to extract meaningful information across modalities while
primarily remaining unimodal. We evaluated our system against other large
language models across multiple tasks, including emotion recognition and
question answering, in a zero-shot setting to avoid the bias that fine-tuning
may introduce. Our system outperformed the best-performing Large Language
Model (LLM) on the MELD, IEMOCAP, and CoQA datasets, achieving Weighted F1
(WF1) scores of 46.74% on MELD and 25.23% on IEMOCAP and an Overall F1 (OF1)
score of 17% on CoQA, compared with 22.89%, 12.28%, and 7%, respectively, for
that LLM. The goal is to go beyond the statistical view of language
processing and tie it to human concepts such as philosophy and psychoanalysis.
This work represents a significant advancement in the development of
imagination-inspired AI systems, opening new possibilities for AI to generate
deep and interpretable information across modalities, thereby enhancing
human-AI interaction.
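
For readers unfamiliar with the Weighted F1 (WF1) metric reported above, the following is a minimal sketch of the standard support-weighted definition: the F1 score of each class is averaged with weights proportional to how often that class appears in the ground truth. It assumes this standard definition; the exact evaluation code behind the reported scores is not described here, and the function name and emotion labels are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1  # weight each class by its share of the data
    return score


# Hypothetical emotion labels, for illustration only.
y_true = ["joy", "joy", "anger", "neutral"]
y_pred = ["joy", "anger", "anger", "neutral"]
print(weighted_f1(y_true, y_pred))
```

Because each class is weighted by its support, frequent classes dominate the score, which matters for emotion datasets with skewed label distributions.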