
DSGPT: Domain-specific Generative Pre-training Of Transformers For Text Generation In E-commerce Title And Review Summarization

Zhang Xueying, Jiang Yunjiang, Shang Yue, Cheng Zhaomeng, Zhang Chi, Fan Xiaochuan, Xiao Yun, Long Bo. SIGIR 2021

Applications, Fine Tuning, GPT, Language Modeling, Model Architecture, Pretraining Methods, Survey Paper, Training Techniques, Transformer

We propose a novel domain-specific generative pre-training (DS-GPT) method for text generation and apply it to the product title and review summarization problems on E-commerce mobile display. First, we adopt a decoder-only transformer architecture, which fits well for fine-tuning tasks by combining input and output all together. Second, we demonstrate that utilizing only a small amount of pre-training data in related domains is powerful. Pre-training a language model from a general corpus such as Wikipedia or the Common Crawl requires tremendous time and resource commitment, and can be wasteful if the downstream tasks are limited in variety. Our DSGPT is pre-trained on a limited dataset, the Chinese short text summarization dataset (LCSTS). Third, our model does not require product-related human-labeled data. For the title summarization task, the state of the art explicitly uses additional background knowledge in the training and predicting stages. In contrast, our model implicitly captures this knowledge and achieves significant improvement over other methods after fine-tuning on the public Taobao.com dataset. For the review summarization task, we utilize a JD.com in-house dataset and observe similar improvement over standard machine translation methods, which lack the flexibility of fine-tuning. Our proposed work can be simply extended to other domains for a wide range of text generation tasks.
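The abstract's first point, combining input and output into a single sequence so a decoder-only transformer can be fine-tuned on a generation task, follows the standard causal-LM recipe. The sketch below illustrates that idea only; the `gpt2` checkpoint, the separator choice, and the loss masking are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code): fine-tuning a decoder-only LM by
# concatenating the source text and the target summary into one sequence.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")      # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")   # placeholder checkpoint

source = "Long product description text ..."           # e.g. a product description or review
target = "Short product title"                         # desired title / summary

sep = tokenizer.eos_token                              # assumed separator token
# One sequence: [source] <sep> [target] <eos>
text = source + sep + target + tokenizer.eos_token
enc = tokenizer(text, return_tensors="pt")

# Compute the loss only on the target part: mask source tokens with -100.
source_len = len(tokenizer(source + sep)["input_ids"])
labels = enc["input_ids"].clone()
labels[:, :source_len] = -100

loss = model(**enc, labels=labels).loss
loss.backward()                                        # one fine-tuning step (optimizer omitted)
```

At inference time the same model would be prompted with the source text plus the separator and decoded autoregressively to produce the title or review summary.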

Similar Work