Attention Is Indeed All You Need: Semantically Attention-guided Decoding For Data-to-text NLG

Juraska Juraj, Walker Marilyn. arXiv 2021

Tags: Attention Mechanism, Model Architecture

Ever since neural models were adopted in data-to-text language generation, they have invariably been reliant on extrinsic components to improve their semantic accuracy, because the models normally do not exhibit the ability to generate text that reliably mentions all of the information provided in the input. In this paper, we propose a novel decoding method that extracts interpretable information from encoder-decoder models’ cross-attention, and uses it to infer which attributes are mentioned in the generated text, which is subsequently used to rescore beam hypotheses. Using this decoding method with T5 and BART, we show on three datasets its ability to dramatically reduce semantic errors in the generated outputs, while maintaining their state-of-the-art quality.
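
The core idea in the abstract (read cross-attention, infer which input attributes a hypothesis mentions, then rescore the beam) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the fixed attention threshold, the per-attribute token spans, and the linear penalty are all assumptions made for illustration, and the attention matrices are hand-written rather than taken from T5 or BART.

```python
from typing import Dict, List, Set, Tuple

Attention = List[List[float]]          # attention[t][i]: weight of decoder step t on input token i
Spans = Dict[str, Tuple[int, int]]     # attribute name -> [start, end) token span in the input


def mentioned_attributes(attention: Attention, spans: Spans,
                         threshold: float = 0.3) -> Set[str]:
    """Guess which attributes were verbalized (hypothetical heuristic).

    An attribute counts as mentioned if at least one decoder step puts
    `threshold` or more of its attention mass on that attribute's tokens.
    """
    mentioned = set()
    for name, (start, end) in spans.items():
        peak = max((sum(row[start:end]) for row in attention), default=0.0)
        if peak >= threshold:
            mentioned.add(name)
    return mentioned


def rescore(log_prob: float, attention: Attention, spans: Spans,
            penalty: float = 1.0, threshold: float = 0.3) -> float:
    """Subtract a fixed penalty (assumed) for every attribute that looks unmentioned."""
    missing = len(spans) - len(mentioned_attributes(attention, spans, threshold))
    return log_prob - penalty * missing


def rerank(hypotheses: List[Tuple[str, float, Attention]],
           spans: Spans) -> List[Tuple[str, float, Attention]]:
    """Sort beam hypotheses (text, log_prob, attention) by rescored value, best first."""
    return sorted(hypotheses, key=lambda h: rescore(h[1], h[2], spans), reverse=True)


# Toy input with two attributes occupying input tokens 0-1 and 2-3.
spans = {"name": (0, 2), "food": (2, 4)}
complete = ("covers both", -1.5, [[0.6, 0.2, 0.1, 0.1], [0.1, 0.1, 0.5, 0.3]])
omits_food = ("omits food", -1.0, [[0.7, 0.2, 0.05, 0.05], [0.6, 0.3, 0.05, 0.05]])
ranked = rerank([omits_food, complete], spans)
```

In this toy beam, the hypothesis that never attends to the `food` tokens starts with the higher model score but drops below the complete one after the penalty is applied. A real system would need to aggregate attention across layers and heads and tune the threshold, which this sketch leaves out.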
