Open-source LLMs have shown great potential as fine-tuned chatbots,
demonstrating robust reasoning abilities and strong performance on many
existing benchmarks.
Retrieval-Augmented Generation (RAG) is a technique for improving the
performance of LLMs on tasks that the models weren’t explicitly trained on, by
leveraging external knowledge databases. Numerous studies have demonstrated
that RAG improves performance on downstream tasks when it draws on vector
datasets that consist of relevant background information. It has been
implicitly assumed in the field that if adversarial background information is
used in this context, RAG would provide no benefit or would even degrade the
results. To address
this assumption, we tested whether RAG improves the ability of several
open-source LLMs to answer multiple-choice questions (MCQs) in the medical
subspecialty of Nephrology. Unlike previous studies, we examined the effect of
RAG with both relevant and adversarial background databases. We set up several
open-source LLMs, including Llama 3, Phi-3,
Mixtral 8x7b, Zephyr