CUED@WMT19: EWC&LMs

Stahlberg Felix, Saunders Danielle, De Gispert Adria, Byrne Bill. arXiv 2019

[Paper]    
Fine Tuning Model Architecture Pretraining Methods RAG Training Techniques Transformer

Two techniques provide the fabric of the Cambridge University Engineering Department’s (CUED) entry to the WMT19 evaluation campaign: elastic weight consolidation (EWC) and different forms of language modelling (LMs). We report substantial gains by fine-tuning very strong baselines on former WMT test sets using a combination of checkpoint averaging and EWC. A sentence-level Transformer LM and a document-level LM based on a modified Transformer architecture yield further gains. As in previous years, we also extract \(n\)-gram probabilities from SMT lattices, which can be seen as a source-conditioned \(n\)-gram LM.
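
The abstract does not include implementation details, so the snippet below is only a rough sketch of the EWC fine-tuning objective it refers to: the standard quadratic penalty \(\tfrac{\lambda}{2}\sum_i F_i(\theta_i-\theta_i^*)^2\) added to the task loss, anchoring the fine-tuned parameters to the general-domain baseline. The toy linear model, placeholder Fisher estimates, and \(\lambda\) value are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ewc_penalty(model, fisher, star_params, lam=1.0):
    """Quadratic EWC term: 0.5 * lam * sum_i F_i * (theta_i - theta*_i)^2."""
    penalty = torch.zeros(())
    for name, param in model.named_parameters():
        if name in fisher:
            penalty = penalty + (fisher[name] * (param - star_params[name]) ** 2).sum()
    return 0.5 * lam * penalty

# Toy usage (illustrative only): a small linear model standing in for the NMT system.
model = nn.Linear(4, 2)
star_params = {n: p.detach().clone() for n, p in model.named_parameters()}
# In practice the diagonal Fisher is estimated from squared gradients on the
# original (general-domain) data; ones are used here as placeholders.
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}

x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))
task_loss = F.cross_entropy(model(x), y)
loss = task_loss + ewc_penalty(model, fisher, star_params, lam=0.1)
loss.backward()
```

Checkpoint averaging, the other ingredient the abstract combines with EWC, simply averages the parameter values of several saved training checkpoints into a single model.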

Similar Work