Recent years have seen a proliferation of attention mechanisms and the rise of Transformers in Natural Language Generation (NLG). Previously, state-of-the-art NLG architectures such as RNNs and LSTMs ran into vanishing gradient problems: as sentences grew longer, the path length between distant positions grew linearly, and sequential, word-by-word processing hindered parallelization. Transformers usher in a new era. In this paper, we explore three major Transformer-based models, namely GPT, BERT, and XLNet, which carry significant implications for the field. NLG is a burgeoning area, now bolstered by rapid developments in attention mechanisms. From poetry generation to summarization, text generation benefits as Transformer-based language models achieve groundbreaking results.
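To make concrete why attention sidesteps the sequential bottleneck noted above, the sketch below gives a minimal, illustrative implementation of scaled dot-product attention, the core operation shared by Transformer models such as GPT, BERT, and XLNet. It is a reimplementation of the standard formulation rather than code from any of these models, and the toy shapes and example inputs are assumptions chosen only for demonstration.

```python
# Minimal sketch of scaled dot-product attention (illustrative, not taken
# from GPT, BERT, or XLNet): softmax(Q K^T / sqrt(d_k)) V.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attend over all positions at once; no sequential word-by-word loop.

    Q, K, V: arrays of shape (seq_len, d_k) for queries, keys, and values.
    Returns the attended output (seq_len, d_k) and the attention weights.
    """
    d_k = Q.shape[-1]
    # Every query is compared with every key in one matrix product, so the
    # path between any two positions is constant rather than growing with
    # their distance in the sentence.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys turns the scores into attention weights per query.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example (assumed values): 4 token positions, 8-dimensional vectors.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out, attn = scaled_dot_product_attention(Q, K, V)
print(out.shape, attn.shape)  # (4, 8) (4, 4)
```

Because the matrix products relate every pair of positions simultaneously, the computation parallelizes across the whole sequence, which is the property that frees Transformers from the word-by-word processing of recurrent architectures.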