Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models

Sheng-Chieh Lin, Jheng-Hong Yang, Rodrigo Nogueira, Ming-Feng Tsai, Chuan-Ju Wang, Jimmy Lin. arXiv 2020

[Paper]
Tags: Model Architecture, Pretraining Methods, RAG, Transformer

This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs). We leverage PLMs to address the strong token-to-token independence assumption made by maximum likelihood estimation, the common training objective for the CQR task. In CQR benchmarks of task-oriented dialogue systems, we evaluate fine-tuned PLMs on the recently introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-of-domain task. Examining a variety of architectures with different numbers of parameters, we demonstrate that the Text-to-Text Transfer Transformer (T5) achieves the best results on both CANARD and CAsT with fewer parameters than similar transformer architectures.
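
As a concrete illustration of the setup described in the abstract, the sketch below shows how a sequence-to-sequence PLM such as T5 might be fine-tuned with a standard maximum likelihood (cross-entropy) objective to rewrite a context-dependent question into a self-contained one. This is a minimal sketch rather than the authors' code: the checkpoint name (t5-base), the separator string, and the example dialogue and gold rewrite are illustrative assumptions standing in for a CANARD training pair.

```python
# Minimal sketch of seq2seq fine-tuning for conversational question reformulation (CQR).
# Assumptions: t5-base checkpoint, a simple " ||| " separator between turns, and a
# made-up dialogue/rewrite pair in place of real CANARD data.
import torch
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Conversation history followed by the question that needs reformulation.
history = [
    "Who directed Inception?",
    "Christopher Nolan directed Inception.",
]
question = "What else has he made?"
source = " ||| ".join(history + [question])        # flatten context + question into one input
target = "What else has Christopher Nolan made?"   # gold self-contained rewrite (CANARD-style)

# One MLE training step: the labels are the token ids of the rewritten question.
inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)
labels = tokenizer(target, return_tensors="pt", truncation=True, max_length=64).input_ids
loss = model(**inputs, labels=labels).loss
loss.backward()  # an optimizer step would follow in a full training loop

# Inference: beam search decodes the reformulated, context-independent question.
model.eval()
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

At evaluation time the same generate-and-decode step is applied to held-out CANARD conversations (in-domain) and to TREC 2019 CAsT turns (out-of-domain), and the decoded rewrites are scored against the reference reformulations.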

Similar Work