
ChipNeMo: Domain-Adapted LLMs for Chip Design

Liu Mingjie, Ene Teodor-dumitru, Kirby Robert, Cheng Chris, Pinckney Nathaniel, Liang Rongjian, Alben Jonah, Anand Himyanshu, Banerjee Sanmitra, Bayraktaroglu Ismet, Bhaskaran Bonita, Catanzaro Bryan, Chaudhuri Arjun, Clay Sharon, Dally Bill, Dang Laura, Deshpande Parikshit, Dhodhi Siddhanth, Halepete Sameer, Hill Eric, Hu Jiashang, Jain Sumit, Jindal Ankit, Khailany Brucek, Kokai George, Kunal Kishor, Li Xiaowei, Lind Charley, Liu Hao, Oberman Stuart, Omar Sujeet, Pasandi Ghasem, Pratty Sreedhar, Raiman Jonathan, Sarkar Ambar, Shao Zhengjiang, Sun Hanfei, Suthar Pratik P, Tej Varun, Turner Walker, Xu Kaizhe, Ren Haoxing. arXiv 2023

[Paper]
Applications Fine Tuning GPT Model Architecture Pretraining Methods Tokenization Training Techniques

ChipNeMo aims to explore the applications of large language models (LLMs) for industrial chip design. Instead of directly deploying off-the-shelf commercial or open-source LLMs, we adopt the following domain adaptation techniques: domain-adaptive tokenization, domain-adaptive continued pretraining, model alignment with domain-specific instructions, and domain-adapted retrieval models. We evaluate these methods on three selected LLM applications for chip design: an engineering assistant chatbot, EDA script generation, and bug summarization and analysis. Our evaluations demonstrate that domain-adaptive pretraining of language models can lead to superior performance on domain-related downstream tasks compared to their base LLaMA2 counterparts, without degradation in generic capabilities. In particular, our largest model, ChipNeMo-70B, outperforms the highly capable GPT-4 on two of our use cases, namely the engineering assistant chatbot and EDA script generation, while exhibiting competitive performance on bug summarization and analysis. These results underscore the potential of domain-specific customization for enhancing the effectiveness of large language models in specialized applications.
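To make one of the listed techniques concrete, the sketch below shows what domain-adaptive tokenization can look like with the Hugging Face transformers API: a general-purpose tokenizer is augmented with chip-design terms and the model's embedding table is resized so the new token IDs receive trainable vectors during continued pretraining. This is a minimal illustration, not the paper's actual pipeline; the checkpoint name and the domain tokens are placeholder assumptions.

```python
# Illustrative sketch of domain-adaptive tokenization (not from the paper).
from transformers import AutoTokenizer, AutoModelForCausalLM

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base LLaMA2 checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Hypothetical chip-design terms that a general-purpose tokenizer would
# otherwise split into many subword pieces.
domain_tokens = ["clock_gating", "netlist", "floorplan", "timing_arc"]
num_added = tokenizer.add_tokens(domain_tokens)

# Grow the embedding matrix so the new tokens have trainable embeddings;
# these are then learned during domain-adaptive continued pretraining.
model.resize_token_embeddings(len(tokenizer))
print(f"Added {num_added} domain tokens; vocab size is now {len(tokenizer)}")
```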
