Exploring Transformers In Natural Language Generation: GPT, BERT, And XLNet

Topal M. Onat, Bas Anil, Van Heerden Imke. arXiv 2021

[Paper]    
Applications · Attention Mechanism · BERT · GPT · Language Modeling · Large Scale Training · Model Architecture · Pretraining Methods · Tools · Transformer

Recent years have seen a proliferation of attention mechanisms and the rise of Transformers in Natural Language Generation (NLG). Previously, state-of-the-art NLG architectures such as RNNs and LSTMs ran into vanishing gradient problems; as sentences grew longer, the distance between positions grew linearly, and sequential computation hindered parallelization since sentences were processed word by word. Transformers usher in a new era. In this paper, we explore three major Transformer-based models, namely GPT, BERT, and XLNet, that carry significant implications for the field. NLG is a burgeoning area that is now bolstered by rapid developments in attention mechanisms. From poetry generation to summarization, text generation benefits as Transformer-based language models achieve groundbreaking results.
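
The abstract contrasts sequential RNN/LSTM decoding with attention-based Transformer generation for tasks such as poetry and summarization. As a rough illustration only (not taken from the paper), the sketch below samples text continuations from a pretrained GPT-2 decoder via the Hugging Face `transformers` library; the model name, prompt, and sampling parameters are assumptions chosen for the example.

```python
# Minimal sketch: text generation with a Transformer decoder (GPT-2).
# Assumptions: the Hugging Face `transformers` package is installed and the
# publicly available "gpt2" checkpoint stands in for the GPT models surveyed.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The sea at dusk"  # illustrative prompt, e.g. for poetry-style generation
outputs = generator(
    prompt,
    max_new_tokens=40,        # length of the sampled continuation
    do_sample=True,           # sample instead of greedy decoding, for variety
    top_p=0.9,                # nucleus sampling over the most probable tokens
    num_return_sequences=2,   # draw two alternative continuations
)

for out in outputs:
    print(out["generated_text"])
```

Because attention relates every position to every other in a single layer, the whole prompt is processed in parallel; only the new tokens are generated step by step, unlike an RNN, which must also consume the prompt word by word.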

Similar Work