Divide Et Impera: Multi-transformer Architectures For Complex Nlp-tasks

Helland Solveig, Gavagnin Elena, De Spindler Alexandre. Proceedings of the 2023

[Paper]    
Ethics And Bias · Fine Tuning · Model Architecture · Pretraining Methods · Reinforcement Learning · Training Techniques · Transformer

The growing capabilities of transformer models pave the way for solving increasingly complex NLP tasks. A key to supporting application-specific requirements is the ability to fine-tune. However, compiling a fine-tuning dataset tailored to a complex task is tedious and results in large datasets, which limits the ability to control transformer output. We present an approach in which a complex task is divided into simpler subtasks. Multiple transformer models are each fine-tuned on one subtask and then chained to accomplish the complex task. This simplifies the compilation of fine-tuning datasets and increases overall controllability. Using the reduction of gender bias as an example of a complex task, we demonstrate our approach and show that it performs better than using a single model.
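The abstract only sketches the architecture at a high level. A minimal illustration of chaining subtask-specific models, assuming two hypothetical fine-tuned checkpoints (`org/mark-gendered-terms`, `org/rewrite-neutral`, which are placeholders and not artifacts released with the paper) and the Hugging Face `pipeline` API, might look like this:

```python
from transformers import pipeline

# Hypothetical checkpoints: each model is fine-tuned on exactly one subtask.
# The names below are illustrative placeholders.
SUBTASK_MODELS = [
    "org/mark-gendered-terms",   # subtask 1: tag gendered phrases in the input
    "org/rewrite-neutral",       # subtask 2: rewrite the tagged phrases neutrally
]

def run_chain(text: str) -> str:
    """Pass the text through each subtask model in sequence."""
    for checkpoint in SUBTASK_MODELS:
        generator = pipeline("text2text-generation", model=checkpoint)
        text = generator(text, max_new_tokens=128)[0]["generated_text"]
    return text

print(run_chain("Every chairman should bring his own laptop."))
```

Because each model only has to learn one narrow transformation, its fine-tuning dataset can stay small and focused, which is the controllability benefit the paper argues for.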

Similar Work