Publications by Tag
The following tags appear in the publications listed in the review:
ACL, Agentic, Applications, Arxiv, Attention Mechanism, BERT, Bias Mitigation, COLING, Dataset, Distillation, Efficiency and Optimization, EMNLP, Ethics and Bias, Fairness, Few-Shot, Fine-Tuning, GPT, Has Code, ICLR, ICML, In-Context Learning, Interpretability and Explainability, INTERSPEECH, KDD, Language Modeling, Large-Scale Training, LREC, Masked Language Model, Merging, Model Architecture, Multimodal Models, NeurIPS, Pre-Training, Prompting, Pruning, Quantization, RAG, RecSys, Reinforcement Learning, Responsible AI, Scaling Laws, Security, SLT, Survey Paper, TACL, Tokenization, Tools, Training Techniques, Transformer, Uncategorized, Vector Indexing, WMT
Tags
Below is a list of all tags, each followed by its related papers.
🏷 ACL
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Training Neural Response Selection For Task-oriented Dialogue Systems Matthew Henderson et al.
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- Language Models As Knowledge Bases? Fabio Petroni et al.
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- TaCL: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- Few-shot Conversational Dense Retrieval Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, Zhiyuan Liu
- Galactica: A Large Language Model For Science Ross Taylor et al.
- DialFRED: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo D'amorim, Michael Pradel
- API-Bank: A Comprehensive Benchmark For Tool-augmented LLMs Minghao Li et al.
- H₂O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- Is ChatGPT The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- LLM-Blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- SwiftSage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Calibrated Language Models Must Hallucinate Adam Tauman Kalai, Santosh S. Vempala
- Can ChatGPT And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Efficient And Effective Text Encoding For Chinese LLaMA And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Trustworthy LLMs: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
🏷 Agentic
- Deep Active Learning For Dialogue Generation Nabiha Asghar, Pascal Poupart, Xin Jiang, Hang Li
- A Simple, Fast Diverse Decoding Algorithm For Neural Generation Jiwei Li, Will Monroe, Dan Jurafsky
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Deep Reinforcement Learning For Dialogue Generation Jiwei Li et al.
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Ask The Right Questions: Active Question Reformulation With Reinforcement Learning Christian Buck et al.
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Batch Policy Gradient Methods For Improving Neural Conversation Models Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- A Deep Reinforcement Learning Chatbot Iulian V. Serban et al.
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- R³: Reinforced Reader-ranker For Open-domain Question Answering Shuohang Wang et al.
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Sample-efficient Actor-critic Reinforcement Learning With Supervised Data For Dialogue Management Pei-hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young
- Iris: A Conversational Agent For Complex Tasks Ethan Fast, Binbin Chen, Julia Mendelsohn, Jonathan Bassen, Michael Bernstein
- Grounding Language For Transfer In Deep Reinforcement Learning Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola
- ParlAI: A Dialog Research Software Platform Alexander H. Miller et al.
- Latent Intention Dialogue Models Tsung-hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
- Wizard Of Wikipedia: Knowledge-powered Conversational Agents Emily Dinan et al.
- Towards Explainable And Controllable Open Domain Dialogue Generation With Dialogue Acts Can Xu, Wei Wu, Yu Wu
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Hybrid Retrieval-generation Reinforced Agent For Medical Image Report Generation Christy Y. Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- A Study Of Reinforcement Learning For Neural Machine Translation Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, Tie-yan Liu
- Building A Conversational Agent Overnight With Dialogue Self-play Pararth Shah et al.
- On Evaluating And Comparing Open Domain Dialog Systems Anu Venkatesh et al.
- Guiding Policies With Language Via Meta-learning John D. Co-reyes et al.
- BabyAI: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-boisvert et al.
- Training Millions Of Personalized Dialogue Agents Pierre-emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Countering Language Drift Via Visual Grounding Jason Lee, Kyunghyun Cho, Douwe Kiela
- Generating Empathetic Responses By Looking Ahead The User's Sentiment Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Ensemble-based Deep Reinforcement Learning For Chatbots Heriberto Cuayáhuitl et al.
- Robust Navigation With Language Pretraining And Stochastic Sampling Xiujun Li et al.
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Caire: An Empathetic Neural Chatbot Zhaojiang Lin et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- Personalizing Dialogue Agents Via Meta-learning Zhaojiang Lin, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Multimodal Transformer Networks For End-to-end Video-grounded Dialogue Systems Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi
- Language As An Abstraction For Hierarchical Deep Reinforcement Learning Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- TransferTransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- Learning From Dialogue After Deployment: Feed Yourself, Chatbot! Braden Hancock, Antoine Bordes, Pierre-emmanuel Mazaré, Jason Weston
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Deep Learning Based Chatbot Models Richard Csaky
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- What Makes A Good Conversation? How Controllable Attributes Affect Human Judgments Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
- Towards Scalable Multi-domain Conversational Agents: The Schema-guided Dialogue Dataset Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- Consistent Dialogue Generation With Self-supervised Feature Learning Yizhe Zhang et al.
- Do Neural Dialog Systems Use The Conversation History Effectively? An Empirical Study Chinnadhurai Sankar, Sandeep Subramanian, Christopher Pal, Sarath Chandar, Yoshua Bengio
- Reinforcement Learning Based Emotional Editing Constraint Conversation Generation Jia Li, Xiao Sun, Xing Wei, Changliang Li, Jianhua Tao
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- ALFWorld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- Countering Language Drift With Seeded Iterated Learning Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- Low-resource Knowledge-grounded Dialogue Generation Xueliang Zhao et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Collaborative Storytelling With Large-scale Neural Language Models Eric Nichols, Leo Gao, Randy Gomez
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-lan Boureau
- Controlling Style In Generated Dialogue Eric Michael Smith, Diana Gonzalez-rico, Emily Dinan, Y-lan Boureau
- BoB: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Internet-augmented Dialogue Generation Mojtaba Komeili, Kurt Shuster, Jason Weston
- A Short Survey Of Pre-trained Language Models For Conversational AI -- A New Age In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- Multimodal Dialogue Response Generation Qingfeng Sun et al.
- Hierarchical Task Learning From Language Instructions With Unified Transformers And Self-monitoring Yichi Zhang, Joyce Chai
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Language Models As Agent Models Jacob Andreas
- WebShop: Towards Scalable Real-world Web Interaction With Grounded Language Agents Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
- CodeRL: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- ReAct: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By LLMs Laura Ruis et al.
- BlenderBot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- PlanBench: An Extensible Benchmark For Evaluating Large Language Models On Planning And Reasoning About Change Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati
- Evolution Through Large Models Joel Lehman et al.
- ChatGPT: The End Of Online Exam Integrity? Teo Susnjak
- GPT-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- DialFRED: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- LLM-Planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- Enabling Conversational Interaction With Mobile UI Using Large Language Models Bryan Wang, Gang Li, Yang Li
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Contrastive Learning Reduces Hallucination In Conversations Weiwei Sun et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- Do LLMs Understand Social Knowledge? Evaluating The Sociability Of Large Language Models With SocKET Benchmark Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- LeanContext: Cost-efficient Domain-specific Question Answering Using LLMs Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked Mansi Phute et al.
- Driving With LLMs: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- AgentCF: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Qwen Technical Report Jinze Bai et al.
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using LLM-based Conversational Agents Zhiping Zhang et al.
- BadGPT: Exploring Security Vulnerabilities Of ChatGPT Via Backdoor Attacks To InstructGPT Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- LLM-Grounder: Open-vocabulary 3D Visual Grounding With Large Language Model As An Agent Jianing Yang et al.
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-Graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- PaperQA: Retrieval-augmented Generative Agent For Scientific Research Jakub Lála et al.
- Theory Of Mind For Multi-agent Collaboration Via Large Language Models Huao Li et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Chain Of Hindsight Aligns Language Models With Feedback Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- PersonaLLM: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- ChatGPT Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- MetaGPT: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- NExT-GPT: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Generating Phishing Attacks Using ChatGPT Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Automatic Prompt Optimization With "Gradient Descent" And Beam Search Reid Pryzant et al.
- VELMA: Verbalization Embodiment Of LLM Agents For Vision And Language Navigation In Street View Raphael Schumann et al.
- Can We Trust The Evaluation On ChatGPT? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- AutoGen: Enabling Next-gen LLM Applications Via Multi-agent Conversation Qingyun Wu et al.
- DSPy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Video-ChatGPT: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- CogAgent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- Is ChatGPT Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- MedAgents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Unveiling Security, Privacy, And Ethical Concerns Of ChatGPT Xiaodong Wu, Ran Duan, Jianbing Ni
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- NavGPT: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- ChatEval: Towards Better LLM-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- MemGPT: Towards LLMs As Operating Systems Charles Packer et al.
- ChatDev: Communicative Agents For Software Development Chen Qian et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- SwiftSage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- ExpertPrompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- ChatGPT: Applications, Opportunities, And Threats Aram Bahrini et al.
- ExpeL: LLM Agents Are Experiential Learners Andrew Zhao et al.
- ChemCrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Opening Up ChatGPT: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- OpenAssistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Summary Of ChatGPT-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- EmbodiedGPT: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- March In Chat: Interactive Prompting For Remote Embodied Referring Expression Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu
- RecMind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Character-LLM: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- HuggingGPT: Solving AI Tasks With ChatGPT And Its Friends In Hugging Face Yongliang Shen et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- RoCo: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- The Ultimate Guide To Fine-tuning LLMs From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- When Large Language Model Agents Meet 6G Networks: Perception, Grounding, And Alignment Minrui Xu et al.
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- CloChat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- DeepSeek-V2: A Strong, Economical, And Efficient Mixture-of-experts Language Model DeepSeek-AI et al.
- Understanding Large-language Model (llm)-powered Human-robot Interaction Callie Y. Kim, Christine P. Lee, Bilge Mutlu
- AutoCodeRover: Autonomous Program Improvement Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- DeepSeek-R1: Incentivizing Reasoning Capability In LLMs Via Reinforcement Learning DeepSeek-AI et al.
🏷 Applications
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Neural Text Generation: A Practical Guide Ziang Xie
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Seq2seq-vis: A Visual Debugging Tool For Sequence-to-sequence Models Hendrik Strobelt et al.
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Megatron-LM: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- NeMo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- ConveRT: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- CodeGRU: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- PyMT5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Measuring And Reducing Gendered Correlations In Pre-trained Models Kellie Webster et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-Sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- GShard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- DeeBERT: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- AraGPT2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization Junjie Hu et al.
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- MiniLM: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- CodeBERT: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- LightSeq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- LightningDOT: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-Rong Wen
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle McDonell
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- NewsBERT: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Counterfactual Memorization In Neural Language Models Chiyuan Zhang et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- BARTScore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- What Changes Can Large-scale Language Models Bring? Intensive Study On HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Human Parity On CommonsenseQA: Augmenting Self-attention With External Attention Yichong Xu et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- OpenPrompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Wordcraft: A Human-AI Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- AI Chains: Transparent And Controllable Human-AI Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Distilling Large Language Models Into Tiny And Effective Students Using pQRNN Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- VisualGPT: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Dialogue History Matters! Personalized Response Selection In Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- FastMoE: A Fast Mixture-of-expert Training System Jiaao He et al.
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-AI Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Using Large Language Models To Simulate Multiple Humans And Replicate Human Subject Studies Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- What Do LLMs Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- LaMDA: Language Models For Dialog Applications Romal Thoppilan et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Large Language Models Meet NL2Code: A Survey Daoguang Zan et al.
- ByteTransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- DS-1000: A Natural And Reliable Benchmark For Data Science Code Generation Yuhang Lai et al.
- BLOOM: A 176B-parameter Open-access Multilingual Language Model BigScience Workshop et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Compositional Semantic Parsing With Large Language Models Andrew Drozdov et al.
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Holistic Evaluation Of Language Models Percy Liang et al.
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Time-LLM: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Med-Flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- H\(_2\)O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Zero- And Few-shot Prompting With LLMs: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- A Systematic Study And Comprehensive Evaluation Of ChatGPT On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Natural Language Generation And Understanding Of Big Code For AI-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- A Bibliometric Review Of Large Language Models Research From 2017 To 2023 Lizhou Fan et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-Zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- 14 Examples Of How LLMs Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- Automating Customer Service Using LangChain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Not What You've Signed Up For: Compromising Real-world LLM-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- AI-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- MiniGPT-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Evaluating GPT-4 And ChatGPT On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- LERF: Language Embedded Radiance Fields Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, Matthew Tancik
- The Political Ideology Of Conversational AI: Converging Evidence On ChatGPT's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Qwen Technical Report Jinze Bai et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As ChatGPT Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- On The Robustness Of ChatGPT: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Large Language Models Cannot Self-correct Reasoning Yet Jie Huang et al.
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A ChatGPT Perspective Jiatong Li et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Ethical ChatGPT: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- The Impact Of ChatGPT And LLMs On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- Challenges And Applications Of Large Language Models Jean Kaddour et al.
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Multimodal ChatGPT For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Auditing Large Language Models: A Three-layered Approach Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- "It's Not Like Jarvis, But It's Pretty Close!" -- Examining ChatGPT's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- ChatGPT In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- LLMLingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, Lili Qiu
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- FinGPT: Open-source Financial Large Language Models Hongyang Yang, Xiao-Yang Liu, Christina Dan Wang
- BioInstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer McKinney, Dean Carignan, Eric Horvitz
- Is ChatGPT The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- ChatGPT For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- PersonaLLM: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- LLM-Rec: Personalized Recommendation Via Prompting Large Language Models Hanjia Lyu et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Llama Guard: LLM-based Input-output Safeguard For Human-AI Conversations Hakan Inan et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- On The Possibilities Of AI-generated Text Detection Souradip Chakraborty et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- LL3DA: Visual Interactive Instruction Tuning For Omni-3D Understanding, Reasoning, And Planning Sijin Chen et al.
- Opportunities And Challenges For ChatGPT And Large Language Models In Biomedicine And Health Shubo Tian et al.
- AutoML-GPT: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- Evaluation Of ChatGPT Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Decoding ChatGPT: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- ChatGPT Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short ChatGPT-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Let's Have A Chat! A Conversation With ChatGPT: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- AI Transparency In The Age Of LLMs: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- AutoGen: Enabling Next-gen LLM Applications Via Multi-agent Conversation Qingyun Wu et al.
- Scaling Laws For Language Encoding Models In fMRI Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Regulating ChatGPT And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- GIT-Mol: A Multi-modal Large Language Model For Molecular Science With Graph, Image, And Text Pengfei Liu, Yiming Ren, Jun Tao, Zhixiang Ren
- AudioPaLM: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-Bonilla et al.
- DriveGPT4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Exploring The Potential Of Large Language Models To Generate Formative Programming Feedback Natalie Kiesler, Dominic Lohr, Hieke Keuning
- State Of What Art? A Call For Multi-prompt LLM Evaluation Moran Mizrahi et al.
- A Review Of ChatGPT Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of LLMs By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Unlocking The Potential Of ChatGPT: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- ChatGPT Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Can AI-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- FlashAttention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- NeMo Guardrails: A Toolkit For Controllable And Safe LLM Applications With Programmable Rails Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, Jonathan Cohen
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- RLHF-V: Towards Trustworthy MLLMs Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- SpQR: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- MedAlpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- Red Teaming ChatGPT Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- EvalLM: Interactive Evaluation Of Large Language Model Prompts On User-defined Criteria Tae Soo Kim, Yoonjoo Lee, Jamin Shin, Young-Ho Kim, Juho Kim
- UniversalNER: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- Generative Recommendation: Towards Next-generation Recommender Paradigm Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, Tat-Seng Chua
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Is ChatGPT Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- Unveiling Security, Privacy, And Ethical Concerns Of ChatGPT Xiaodong Wu, Ran Duan, Jianbing Ni
- Gemini: A Family Of Highly Capable Multimodal Models Gemini Team et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- ChatGPT Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Simulating H.P. Lovecraft Horror Literature With The ChatGPT Large Language Model Eduardo C. Garrido-Merchán, José Luis Arroyo-Barrigüete, Roberto Gozalo-Brizuela
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Text-to-SQL Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- Show-1: Marrying Pixel And Latent Diffusion Models For Text-to-video Generation David Junhao Zhang et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On ChatGPT In AIGC Era Chaoning Zhang et al.
- PMC-LLaMA: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- BlackVIP: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- LLMSecEval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-Yu Hsieh et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Evaluation Of ChatGPT For NLP-based Mental Health Applications Bishal Lamichhane
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Code Llama: Open Foundation Models For Code Baptiste Rozière et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- ChatGPT: Applications, Opportunities, And Threats Aram Bahrini et al.
- Med-HALT: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- ChemCrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- ChatGPT Is A Remarkable Tool -- For Experts Amos Azaria, Rina Azoulay, Shulamit Reches
- Fighting Fire With Fire: Can ChatGPT Detect AI-generated Text? Amrita Bhattacharjee, Huan Liu
- The (ab)use Of Open Source Code To Train Large Language Models Ali Al-Kaswan, Maliheh Izadi
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Can ChatGPT And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For VLMs Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Should ChatGPT Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Summary Of ChatGPT-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Trustworthy LLMs: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- EduChat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- ChatGraph: Interpretable Text Classification By Converting ChatGPT Knowledge To Graphs Yucheng Shi et al.
- NL2TL: Transforming Natural Languages To Temporal Logics Using Large Language Models Yongchao Chen, Rujul Gandhi, Yang Zhang, Chuchu Fan
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Mental-LLM: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Recommender Systems In The Era Of Large Language Models (LLMs) Zihuai Zhao et al.
- A Survey On RAG Meeting LLMs: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Chatbot Arena: An Open Platform For Evaluating LLMs By Human Preference Wei-Lin Chiang et al.
- The Ultimate Guide To Fine-tuning LLMs From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- ChatGLM: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team GLM et al.
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Large Language Models And Games: A Survey And Roadmap Roberto Gallotta et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- AI Hallucinations: A Misnomer Worth Clarifying Negar Maleki, Balaji Padmanabhan, Kaushik Dutta
- Exploring ChatGPT And Its Impact On Society Md. Asraful Haque, Shuai Li
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- OpenMedLM: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Revolutionizing Finance With LLMs: An Overview Of Applications And Insights Huaqin Zhao et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Recent Advances In Generative AI And Large Language Models: Current Status, Challenges, And Perspectives Desta Haileselassie Hagos, Rick Battle, Danda B. Rawat
- ChemLLM: A Chemical Large Language Model Di Zhang et al.
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- AI And Memory Wall Amir Gholami et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- CRUD-RAG: A Comprehensive Chinese Benchmark For Retrieval-augmented Generation Of Large Language Models Yuanjie Lyu et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- BioMistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
- mGTE: Generalized Long-context Text Representation And Reranking Models For Multilingual Text Retrieval Xin Zhang et al.
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
🏷 Arxiv
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- LeanContext: Cost-efficient Domain-specific Question Answering Using LLMs Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- LLMLingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-Yew Lin, Yuqing Yang, Lili Qiu
- Summary Of ChatGPT-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
🏷 Attention Mechanism
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Attention Is All You Need Ashish Vaswani et al.
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- Sequence-to-sequence Learning For Task-oriented Dialogue With Dialogue State Representation Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin, Ting Liu
- Seq2RDF: An End-to-end Application For Deriving Triples From Natural Language Text Yue Liu, Tongtao Zhang, Zhicheng Liang, Heng Ji, Deborah L. McGuinness
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Character-level Language Modeling With Deeper Self-attention Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- SDNet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Commonsense For Generative Multi-hop Question Answering Tasks Lisa Bauer, Yicheng Wang, Mohit Bansal
- Pervasive Attention: 2D Convolutional Neural Networks For Sequence-to-sequence Prediction Maha Elbayad, Laurent Besacier, Jakob Verbeek
- Hierarchical Neural Story Generation Angela Fan, Mike Lewis, Yann Dauphin
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Visualbert: A Simple And Performant Baseline For Vision And Language Liunian Harold Li, Mark Yatskar, Da Yin, Cho-jui Hsieh, Kai-wei Chang
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Structured Pruning Of A BERT-based Question Answering Model J. S. McCarley, Rishav Chakravarti, Avirup Sil
- Multimodal Transformer Networks For End-to-end Video-grounded Dialogue Systems Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Modeling Recurrence For Transformer Jie Hao et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- ACUTE-EVAL: Improved Dialogue Evaluation With Optimized Questions And Multi-turn Comparisons Margaret Li, Jason Weston, Stephen Roller
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Talking-heads Attention Noam Shazeer, Zhenzhong Lan, Youlong Cheng, Nan Ding, Le Hou
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Rikinet: Reading Wikipedia Pages For Natural Question Answering Dayiheng Liu et al.
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- Hard-coded Gaussian Attention For Neural Machine Translation Weiqiu You, Simeng Sun, Mohit Iyyer
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Are We Pretraining It Right? Digging Deeper Into Visio-linguistic Pretraining Amanpreet Singh, Vedanuj Goswami, Devi Parikh
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- DUMA: Reading Comprehension With Transposition Thinking Pengfei Zhu, Hai Zhao, Xiaoguang Li
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- IART: Intent-aware Response Ranking With Transformers In Information-seeking Conversation Systems Liu Yang et al.
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- Focused Attention Improves Document-grounded Generation Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Fastformer: Additive Attention Can Be All You Need Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Human Parity On CommonsenseQA: Augmenting Self-attention With External Attention Yichong Xu et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Learned Token Pruning For Transformers Sehoon Kim et al.
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- Hitskt: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- LLM.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Improving Passage Retrieval With Zero-shot Question Generation Devendra Singh Sachan et al.
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Revisiting End-to-end Speech-to-text Translation From Scratch Biao Zhang, Barry Haddow, Rico Sennrich
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- A Length-extrapolatable Transformer Yutao Sun et al.
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- CTRAN: Cnn-transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- Label Supervised Llama Finetuning Zongxi Li et al.
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Inference-time Intervention: Eliciting Truthful Answers From A Language Model Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- "It's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer McKinney, Dean Carignan, Eric Horvitz
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Starcoder: May The Source Be With You! Raymond Li et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Show-1: Marrying Pixel And Latent Diffusion Models For Text-to-video Generation David Junhao Zhang et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Is Chatgpt A General-purpose Natural Language Processing Task Solver? Chengwei Qin et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- Mistral 7B Albert Q. Jiang et al.
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Pixart-σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
🏷 BERT
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- "Bilingual Expert" Can Find Translation Errors Kai Fan et al.
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Myers-briggs Personality Classification And Personality-specific Language Generation Using Pre-trained Language Models Sedrick Scott Keh, I-tsun Cheng
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Passage Re-ranking With BERT Rodrigo Nogueira, Kyunghyun Cho
- Roberta: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Visualbert: A Simple And Performant Baseline For Vision And Language Liunian Harold Li, Mark Yatskar, Da Yin, Cho-jui Hsieh, Kai-wei Chang
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Pretrained Language Models For Sequential Sentence Classification Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Linking Artificial And Human Neural Representations Of Language Jon Gauthier, Roger Levy
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Language Models As Knowledge Bases? Fabio Petroni et al.
- Camembert: A Tasty French Language Model Louis Martin et al.
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- BERT Has A Mouth, And It Must Speak: BERT As A Markov Random Field Language Model Alex Wang, Kyunghyun Cho
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Is Multilingual BERT Fluent In Language Generation? Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Berts Of A Feather Do Not Generalize Together: Large Variability In Generalization Across Models With Similar Test Set Performance R. Thomas Mccoy, Junghyun Min, Tal Linzen
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Boolq: Exploring The Surprising Difficulty Of Natural Yes/no Questions Christopher Clark et al.
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- Pretrained Transformers For Simple Question Answering Over Knowledge Graphs D. Lukovnikov, A. Fischer, J. Lehmann
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- GRUEN For Evaluating Linguistic Quality Of Generated Text Wanzheng Zhu, Suma Bhat
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Contextualized Perturbation For Textual Adversarial Attack Dianqi Li et al.
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- Speaker-aware BERT For Multi-turn Response Selection In Retrieval-based Chatbots Jia-chen Gu et al.
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Bert-hlstms: BERT And Hierarchical Lstms For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- PALM: Pre-training An Autoencoding&autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies For Multi-turn Response Selection Taesun Whang et al.
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Contrastive Code Representation Learning Paras Jain et al.
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Beyond I.I.D.: Three Levels Of Generalization For Question Answering On Knowledge Bases Yu Gu et al.
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- On Learning Universal Representations Across Languages Xiangpeng Wei et al.
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Calibration Of Pre-trained Transformers Shrey Desai, Greg Durrett
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- How Context Affects Language Models' Factual Predictions Fabio Petroni et al.
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Pre-training Via Paraphrasing Mike Lewis et al.
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- A Closer Look At The Robustness Of Vision-and-language Pre-trained Models Linjie Li, Zhe Gan, Jingjing Liu
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Knowprompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction Xiang Chen et al.
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Robeczech: Czech Roberta, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- Codexglue: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task -- Next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- Predicting The Performance Of Multilingual NLP Models Anirudh Srinivasan et al.
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Bertese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- AMMUS: A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- A Comparative Study Of Transformer-based Language Models On Extractive Question Answering Kate Pearce, Tiffany Zhan, Aneesh Komanduri, Justin Zhan
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- On The Paradox Of Learning To Reason From Data Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-wei Chang, Guy Van Den Broeck
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- Arabart: A Pretrained Arabic Sequence-to-sequence Model For Abstractive Summarization Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- CTRAN: Cnn-transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- Label Supervised Llama Finetuning Zongxi Li et al.
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- Findings Of The Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Alex Warstadt et al.
🏷 Bias Mitigation
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Holistic Evaluation Of Language Models Percy Liang et al.
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
🏷 COLING
🏷 Dataset
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
🏷 Distillation
- Sequence-level Knowledge Distillation Yoon Kim, Alexander M. Rush
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Camel: Mean Teacher Learning For Image Captioning Manuele Barraco et al.
- Teaching Small Language Models To Reason Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adamek, Eric Malmi, Aliaksei Severyn
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Can A Student Large Language Model Perform As Well As Its Teacher? Sia Gholami, Marwan Omar
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
🏷 Efficiency and Optimization
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Sequence-level Knowledge Distillation Yoon Kim, Alexander M. Rush
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Sample-efficient Actor-critic Reinforcement Learning With Supervised Data For Dialogue Management Pei-hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Babyai: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-boisvert et al.
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Real-time Open-domain Question Answering With Dense-sparse Phrase Index Minjoon Seo et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Progressive Generation Of Long Text With Pretrained Language Models Bowen Tan, Zichao Yang, Maruan Al-shedivat, Eric P. Xing, Zhiting Hu
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- Mintl: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Imitation Attacks And Defenses For Black-box Machine Translation Systems Eric Wallace, Mitchell Stern, Dawn Song
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Knowprompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction Xiang Chen et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- UNICORN On RAINBOW: A Universal Commonsense Reasoning Model On A New Multitask Benchmark Nicholas Lourie, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Learned Token Pruning For Transformers Sehoon Kim et al.
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Emergent Abilities Of Large Language Models Jason Wei et al.
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Interactive Code Generation Via Test-driven User-intent Formalization Shuvendu K. Lahiri et al.
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Smoothquant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Camel: Mean Teacher Learning For Image Captioning Manuele Barraco et al.
- Teaching Small Language Models To Reason Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adamek, Eric Malmi, Aliaksei Severyn
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- Llm.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Llm-planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Holistic Evaluation Of Language Models Percy Liang et al.
- Contrastive Learning Reduces Hallucination In Conversations Weiwei Sun et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Wu et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-wei Chang et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Honeybee: Locality-enhanced Projector For Multimodal LLM Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh
- LERF: Language Embedded Radiance Fields Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, Matthew Tancik
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Theory Of Mind For Multi-agent Collaboration Via Large Language Models Huao Li et al.
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Chain Of Hindsight Aligns Language Models With Feedback Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- Visual-language Prompt Tuning With Knowledge-guided Context Optimization Hantao Yao, Rui Zhang, Changsheng Xu
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Can A Student Large Language Model Perform As Well As Its Teacher? Sia Gholami, Marwan Omar
- Do Generative Large Language Models Need Billions Of Parameters? Sia Gholami, Marwan Omar
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Palm 2 Technical Report Rohan Anil et al.
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Codegeex: A Pre-trained Model For Code Generation With Multilingual Benchmarking On Humaneval-x Qinkai Zheng et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- GPT-4 Technical Report Openai et al.
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Are Aligned Neural Networks Adversarially Aligned? Nicholas Carlini et al.
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Dipping Plms Sauce: Bridging Structure And Text For Effective Knowledge Graph Completion Via Conditional Soft Prompting Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-yan Lam
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Detecting And Preventing Hallucinations In Large Vision Language Models Anisha Gunjal, Jihan Yin, Erhan Bas
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Mistral 7B Albert Q. Jiang et al.
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Powerinfer: Fast Large Language Model Serving With A Consumer-grade GPU Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Llm-eval: Unified Multi-dimensional Automatic Evaluation For Open-domain Conversations With Large Language Models Yen-ting Lin, Yun-nung Chen
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- ORPO: Monolithic Preference Optimization Without Reference Model Jiwoo Hong, Noah Lee, James Thorne
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- AI And Memory Wall Amir Gholami et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
- Mgte: Generalized Long-context Text Representation And Reranking Models For Multilingual Text Retrieval Xin Zhang et al.
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
🏷 EMNLP
🏷 Ethics and Bias
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Evaluating Text Gans As Language Models Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant
- Language Gans Falling Short Massimo Caccia et al.
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Berts Of A Feather Do Not Generalize Together: Large Variability In Generalization Across Models With Similar Test Set Performance R. Thomas Mccoy, Junghyun Min, Tal Linzen
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Modifying Memories In Transformer Models Chen Zhu et al.
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Facts As Experts: Adaptable And Interpretable Neural Memory Over Symbolic Knowledge Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-lan Boureau
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-wei Chang, Nanyun Peng
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- AI Chains: Transparent And Controllable Human-ai Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Logan Iv et al.
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Shortcut Learning Of Large Language Models In Natural Language Understanding Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Revisiting End-to-end Speech-to-text Translation From Scratch Biao Zhang, Barry Haddow, Rico Sennrich
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- Holistic Evaluation Of Language Models Percy Liang et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- On Second Thought, Let's Not Think Step By Step! Bias And Toxicity In Zero-shot Reasoning Omar Shaikh, Hongxin Zhang, William Held, Michael Bernstein, Diyi Yang
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Selenite: Scaffolding Online Sensemaking With Comprehensive Overviews Elicited From Large Language Models Michael Xieyang Liu et al.
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- In-context Impersonation Reveals Large Language Models' Strengths And Biases Leonard Salewski, Stephan Alaniz, Isabel Rio-torto, Eric Schulz, Zeynep Akata
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- Chain-of-verification Reduces Hallucination In Large Language Models Shehzaad Dhuliawala et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Palm 2 Technical Report Rohan Anil et al.
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Starcoder: May The Source Be With You! Raymond Li et al.
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Creating Trustworthy Llms: Dealing With Hallucinations In Healthcare AI Muhammad Aurangzeb Ahmad, Ilker Yaramis, Taposh Dutta Roy
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Bias Of Ai-generated Content: An Examination Of News Produced By Large Language Models Xiao Fang et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- The Capacity For Moral Self-correction In Large Language Models Deep Ganguli et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- An Empirical Study Of Catastrophic Forgetting In Large Language Models During Continual Fine-tuning Yun Luo et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Olmo: Accelerating The Science Of Language Models Dirk Groeneveld et al.
- Large Language Models And User Trust: Consequence Of Self-referential Learning Loop And The Deskilling Of Healthcare Professionals Avishek Choudhury, Zaria Chaudhry
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
🏷 Fairness
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Holistic Evaluation Of Language Models Percy Liang et al.
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Facilitating Self-guided Mental Health Interventions Through Human-language Model Interaction: A Case Study Of Cognitive Restructuring Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, Theresa Nguyen, Tim Althoff
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
🏷 Few-Shot
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Language Models Are Few-shot Learners Tom B. Brown et al.
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- The Turking Test: Can Language Models Understand Instructions? Avia Efrat, Omer Levy
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Knowprompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction Xiang Chen et al.
- Entailment As Few-shot Learner Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Language Models Are Few-shot Multilingual Learners Genta Indra Winata et al.
- Grounded Language-image Pre-training Liunian Harold Li et al.
- PTR: Prompt Tuning With Rules For Text Classification Xu Han, Weilin Zhao, Ning Ding, Zhiyuan Liu, Maosong Sun
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Program Synthesis With Large Language Models Jacob Austin et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Finetuned Language Models Are Zero-shot Learners Jason Wei et al.
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- Exploring Prompt-based Few-shot Learning For Grounded Dialog Generation Chujie Zheng, Minlie Huang
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- LFPT5: A Unified Framework For Lifelong Few-shot Language Learning Based On Prompt Tuning Of T5 Chengwei Qin, Shafiq Joty
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Do Prompt-based Models Really Understand The Meaning Of Their Prompts? Albert Webson, Ellie Pavlick
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- GPT Understands, Too Xiao Liu et al.
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Logan Iv et al.
- FLEX: Unifying Evaluation For Few-shot NLP Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
- Constrained Language Models Yield Few-shot Semantic Parsers Richard Shin et al.
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- Show Your Work: Scratchpads For Intermediate Computation With Language Models Maxwell Nye et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Few-shot Conversational Dense Retrieval Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, Zhiyuan Liu
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Scaling Instruction-finetuned Language Models Hyung Won Chung et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Instruction Tuning For Few-shot Aspect-based Sentiment Analysis Siddharth Varia et al.
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- Atlas: Few-shot Learning With Retrieval Augmented Language Models Gautier Izacard et al.
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Retrieval-augmented Generative Question Answering For Event Argument Extraction Xinya Du, Heng Ji
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- PAL: Program-aided Language Models Luyu Gao et al.
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Towards Using Few-shot Prompt Learning For Automating Model Completion Meriem Ben Chaaben, Lola Burgueño, Houari Sahraoui
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Language Model Cascades David Dohan et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
- Llm-planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- Is GPT-3 A Good Data Annotator? Bosheng Ding et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Language Models Of Code Are Few-shot Commonsense Learners Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- Large Language Models Are Human-level Prompt Engineers Yongchao Zhou et al.
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo D'amorim, Michael Pradel
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Quantifying Language Models' Sensitivity To Spurious Features In Prompt Design Or: How I Learned To Start Worrying About Prompt Formatting Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Label Supervised Llama Finetuning Zongxi Li et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction Mohammadreza Pourreza, Davood Rafiei
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Better Patching Using LLM Prompting, Via Self-consistency Toufique Ahmed, Premkumar Devanbu
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- Recmind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- Sentence Simplification Via Large Language Models Yutao Feng, Jipeng Qiang, Yun Li, Yunhao Yuan, Yi Zhu
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
🏷 Fine-Tuning
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- Neural Personalized Response Generation As Domain Adaptation Weinan Zhang, Ting Liu, Yifa Wang, Qingfu Zhu
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Training Millions Of Personalized Dialogue Agents Pierre-emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Multifit: Efficient Multi-lingual Language Model Fine-tuning Julian Martin Eisenschlos et al.
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Language Models As Knowledge Bases? Fabio Petroni et al.
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Boolq: Exploring The Surprising Difficulty Of Natural Yes/no Questions Christopher Clark et al.
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Modifying Memories In Transformer Models Chen Zhu et al.
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Unifiedqa: Crossing Format Boundaries With A Single QA System Daniel Khashabi et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Mintl: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- Speaker-aware BERT For Multi-turn Response Selection In Retrieval-based Chatbots Jia-chen Gu et al.
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- Fine-tuning Pre-trained Language Model With Weak Supervision: A Contrastive-regularized Self-training Approach Yue Yu et al.
- PALM: Pre-training An Autoencoding&autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Exploring And Predicting Transferability Across NLP Tasks Tu Vu et al.
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Exploring Versatile Generative Language Model Via Parameter-efficient Transfer Learning Zhaojiang Lin, Andrea Madotto, Pascale Fung
- How Much Knowledge Can You Pack Into The Parameters Of A Language Model? Adam Roberts, Colin Raffel, Noam Shazeer
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Pre-training Via Paraphrasing Mike Lewis et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- FILM: Following Instructions In Language With Modular Methods So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Raise A Child In Large Language Model: Towards Effective And Generalizable Fine-tuning Runxin Xu et al.
- On The Effectiveness Of Adapter-based Tuning For Pretrained Language Model Adaptation Ruidan He et al.
- Program Synthesis With Large Language Models Jacob Austin et al.
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Unipelt: A Unified Framework For Parameter-efficient Language Model Tuning Yuning Mao et al.
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- UNICORN On RAINBOW: A Universal Commonsense Reasoning Model On A New Multitask Benchmark Nicholas Lourie, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Indicbart: A Pre-trained Model For Indic Natural Language Generation Raj Dabre et al.
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Spot: Better Frozen Model Adaptation Through Soft Prompt Transfer Tu Vu, Brian Lester, Noah Constant, Rami Al-rfou, Daniel Cer
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- XTREME-R: Towards More Challenging And Nuanced Multilingual Evaluation Sebastian Ruder et al.
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- AMMUS: A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- A Simple Recipe For Multilingual Grammatical Error Correction Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Webshop: Towards Scalable Real-world Web Interaction With Grounded Language Agents Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Uni-perceiver V2: A Generalist Model For Large-scale Vision And Vision-language Tasks Hao Li et al.
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- On The Transferability Of Pre-trained Language Models For Low-resource Programming Languages Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-burch
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- Large Language Models Can Self-improve Jiaxin Huang et al.
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Visual Prompt Tuning Menglin Jia et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- Convfinqa: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- Bytetransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Noisytune: A Little Noise Can Help You Finetune Pretrained Language Models Better Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-manchego, Jose Camacho-collados
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- IDPG: An Instance-dependent Prompt Generation Method Zhuofeng Wu et al.
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- No Language Left Behind: Scaling Human-centered Machine Translation NLLB Team et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- Lamini-lm: A Diverse Herd Of Distilled Models From Large-scale Instructions Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-mageed, Alham Fikri Aji
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Label Supervised Llama Finetuning Zongxi Li et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks Ling Luo et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- In-context Impersonation Reveals Large Language Models' Strengths And Biases Leonard Salewski, Stephan Alaniz, Isabel Rio-torto, Eric Schulz, Zeynep Akata
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Dissociating Language And Thought In Large Language Models Kyle Mahowald et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Privacy In Large Language Models: Attacks, Defenses And Future Directions Haoran Li et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Can A Student Large Language Model Perform As Well As Its Teacher? Sia Gholami, Marwan Omar
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Timechat: A Time-sensitive Multimodal Large Language Model For Long Video Understanding Shuhuai Ren, Linli Yao, Shicheng Li, Xu Sun, Lu Hou
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Why Does Chatgpt Fall Short In Providing Truthful Answers? Shen Zheng, Jie Huang, Kevin Chen-chuan Chang
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Luminate: Structured Generation And Exploration Of Design Space With Large Language Models For Human-ai Co-creation Sangho Suh, Meng Chen, Bryan Min, Toby Jia-jun Li, Haijun Xia
- Scalable Educational Question Generation With Pre-trained Language Models Sahan Bulathwela, Hamze Muse, Emine Yilmaz
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Lawyer Llama Technical Report Quzhe Huang et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Fine-tuning Or Retrieval? Comparing Knowledge Injection In Llms Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Promptify: Text-to-image Generation Through Interactive Prompt Exploration With Large Language Models Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Multitask Prompt Tuning Enables Parameter-efficient Transfer Learning Zhen Wang et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-ho Kim
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- One Adapter For All Programming Languages? Adapter Tuning For Code Search And Summarization Deze Wang et al.
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Prompting Large Language Model For Machine Translation: A Case Study Biao Zhang, Barry Haddow, Alexandra Birch
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Fine-tuning Llama For Multi-stage Text Retrieval Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- An Empirical Study Of Catastrophic Forgetting In Large Language Models During Continual Fine-tuning Yun Luo et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Editing Large Language Models: Problems, Methods, And Opportunities Yunzhi Yao et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Adaptmllm: Fine-tuning Multilingual Language Models On Low-resource Languages With Integrated LLM Playgrounds Séamus Lankford, Haithem Afli, Andy Way
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- Supporting Sensemaking Of Large Language Model Outputs At Scale Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld, Elena L. Glassman
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- ORPO: Monolithic Preference Optimization Without Reference Model Jiwoo Hong, Noah Lee, James Thorne
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- Understanding Large-language Model (llm)-powered Human-robot Interaction Callie Y. Kim, Christine P. Lee, Bilge Mutlu
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Can Generative Llms Create Query Variants For Test Collections? An Exploratory Study Marwah Alaofi, Luke Gallagher, Mark Sanderson, Falk Scholer, Paul Thomas
- Deepseek-r1: Incentivizing Reasoning Capability In Llms Via Reinforcement Learning Deepseek-ai et al.
🏷 GPT
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Passage Re-ranking With BERT Rodrigo Nogueira, Kyunghyun Cho
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Gpt-based Generation For Classical Chinese Poetry Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Do Massively Pretrained Language Models Make Better Storytellers? Abigail See, Aneesh Pappu, Rohun Saxena, Akhila Yerukola, Christopher D. Manning
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- The Radicalization Risks Of GPT-3 And Advanced Neural Language Models Kris Mcguffie, Alex Newhouse
- Unsupervised Paraphrase Generation Using Pre-trained Language Models Chaitra Hegde, Shrikumar Patil
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Progressive Generation Of Long Text With Pretrained Language Models Bowen Tan, Zichao Yang, Maruan Al-shedivat, Eric P. Xing, Zhiting Hu
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Non-autoregressive Machine Translation With Latent Alignments Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- PALM: Pre-training An Autoencoding&autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- BANG: Bridging Autoregressive And Non-autoregressive Generation With Large Scale Pretraining Weizhen Qi et al.
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- Training Question Answering Models From Synthetic Data Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
- Emptransfo: A Multi-head Transformer Architecture For Creating Empathetic Dialog Systems Rohola Zandie, Mohammad H. Mahoor
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Narrative Interpolation For Generating And Understanding Stories Su Wang, Greg Durrett, Katrin Erk
- Plotmachines: Outline-conditioned Generation With Dynamic Plot State Tracking Hannah Rashkin, Asli Celikyilmaz, Yejin Choi, Jianfeng Gao
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Genaug: Data Augmentation For Finetuning Text Generators Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- Indonlg: Benchmark And Resources For Evaluating Indonesian Natural Language Generation Samuel Cahyawijaya et al.
- Language Model As An Annotator: Exploring Dialogpt For Dialogue Summarization Xiachong Feng, Xiaocheng Feng, Libo Qin, Bing Qin, Ting Liu
- Entailment As Few-shot Learner Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- Thinking Aloud: Dynamic Context Generation Improves Zero-shot Reasoning Performance Of GPT-2 Gregor Betz, Kyle Richardson, Christian Voigt
- Language Models Are Few-shot Multilingual Learners Genta Indra Winata et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Transformer-based Conditional Variational Autoencoder For Controllable Story Generation Le Fang et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Codexglue: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Finetuned Language Models Are Zero-shot Learners Jason Wei et al.
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-wei Chang, Nanyun Peng
- All That's 'human' Is Not Gold: Evaluating Human Evaluation Of Generated Text Elizabeth Clark et al.
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Glam: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- Dexperts: Decoding-time Controlled Text Generation With Experts And Anti-experts Alisa Liu et al.
- Understanding The Capabilities, Limitations, And Societal Impact Of Large Language Models Alex Tamkin, Miles Brundage, Jack Clark, Deep Ganguli
- One Question Answering Model For Many Languages With Cross-lingual Dense Passage Retrieval Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- GPT Understands, Too Xiao Liu et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Chain-of-thought Prompting Elicits Reasoning In Large Language Models Jason Wei et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Thinking Fast And Slow In Large Language Models Thilo Hagendorff, Sarah Fabi, Michal Kosinski
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Large Language Models And The Reverse Turing Test Terrence Sejnowski
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Using Large Language Models To Simulate Multiple Humans And Replicate Human Subject Studies Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Galactica: A Large Language Model For Science Ross Taylor et al.
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- Language Models That Seek For Knowledge: Modular Search & Generation For Dialogue And Prompt Completion Kurt Shuster et al.
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Do Large Language Models Know What Humans Know? Sean Trott, Cameron Jones, Tyler Chang, James Michaelov, Benjamin Bergen
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Rethinking The Role Of Demonstrations: What Makes In-context Learning Work? Sewon Min et al.
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Diffuseq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Teaching Models To Express Their Uncertainty In Words Stephanie Lin, Jacob Hilton, Owain Evans
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Coauthor: Designing A Human-ai Collaborative Writing Dataset For Exploring Language Model Capabilities Mina Lee, Percy Liang, Qian Yang
- Evaluating Human-language Model Interaction Mina Lee et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Putting Gpt-3's Creativity To The (alternative Uses) Test Claire Stevenson, Iris Smal, Matthijs Baas, Raoul Grasman, Han Van Der Maas
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Is GPT-3 A Good Data Annotator? Bosheng Ding et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Super-naturalinstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Language Models Of Code Are Few-shot Commonsense Learners Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Scaling Up Models And Data With t5x And seqio Adam Roberts et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Emergent Analogical Reasoning In Large Language Models Taylor Webb, Keith J. Holyoak, Hongjing Lu
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Lamini-lm: A Diverse Herd Of Distilled Models From Large-scale Instructions Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-mageed, Alham Fikri Aji
- Evaluating Large Language Models In Theory Of Mind Tasks Michal Kosinski
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Co-writing With Opinionated Language Models Affects Users' Views Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
- Voicebox: Text-guided Multilingual Universal Speech Generation At Scale Matthew Le et al.
- Large Language Models Effectively Leverage Document-level Context For Literary Translation, But Critical Errors Persist Marzena Karpinska, Mohit Iyyer
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Comparing Sentence-level Suggestions To Message-level Suggestions In Ai-mediated Communication Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- 14 Examples Of How Llms Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- News Verifiers Showdown: A Comparative Performance Evaluation Of Chatgpt 3.5, Chatgpt 4.0, Bing AI, And Bard In News Fact-checking Kevin Matthe Caramancion
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- The Imitation Game: Detecting Human And Ai-generated Texts In The Era Of Chatgpt And BARD Kadhim Hayawi, Sakib Shahriar, Sujith Samuel Mathew
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Writer-defined AI Personas For On-demand Feedback Generation Karim Benharrak, Tim Zindulka, Florian Lehmann, Hendrik Heuer, Daniel Buschek
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Spear Phishing With Large Language Models Julian Hazell
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- "it's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- The Impact Of Chatgpt And Llms On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Artificial Muses: Generative Artificial Intelligence Chatbots Have Risen To Human-level Creativity Jennifer Haase, Paul H. P. Hanel
- MEGA: Multilingual Evaluation Of Generative AI Kabir Ahuja et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- "it's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- Llama: Open And Efficient Foundation Language Models Hugo Touvron et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Q-instruct: Improving Low-level Visual Abilities For Multi-modality Foundation Models Haoning Wu et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Choice Over Control: How Users Write With Large Language Models Using Diegetic And Non-diegetic Prompting Hai Dang, Sven Goller, Florian Lehmann, Daniel Buschek
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Exploring The Psychology Of Llms' Moral And Legal Reasoning Guilherme F. C. F. Almeida, José Luiz Nunes, Neele Engelmann, Alex Wiegmann, Marcelo De Araújo
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Mathprompter: Mathematical Reasoning Using Large Language Models Shima Imani, Liang Du, Harsh Shrivastava
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Why Does Chatgpt Fall Short In Providing Truthful Answers? Shen Zheng, Jie Huang, Kevin Chen-chuan Chang
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Verigen: A Large Language Model For Verilog Code Generation Shailja Thakur et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Gpteval: A Survey On Assessments Of Chatgpt And GPT-4 Rui Mao, Guanyi Chen, Xulang Zhang, Frank Guerin, Erik Cambria
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Chatgpt Is Not All You Need. A State Of The Art Review Of Large Generative AI Models Roberto Gozalo-brizuela, Eduardo C. Garrido-merchan
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Large Language Models Predict Human Sensory Judgments Across Six Modalities Raja Marjieh, Ilia Sucholutsky, Pol Van Rijn, Nori Jacoby, Thomas L. Griffiths
- Embers Of Autoregression: Understanding Large Language Models Through The Problem They Are Trained To Solve R. Thomas Mccoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Lawyer Llama Technical Report Quzhe Huang et al.
- Evaluation Of Chatgpt-generated Medical Responses: A Systematic Review And Meta-analysis Qiuhong Wei et al.
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating Gpt-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Are Large Language Models Geospatially Knowledgeable? Prabin Bhandari, Antonios Anastasopoulos, Dieter Pfoser
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- GPT Has Become Financially Literate: Insights From Financial Literacy Tests Of GPT And A Preliminary Test Of How People Use It As A Source Of Advice Paweł Niszczota, Sami Abbas
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- GPT-4 Technical Report Openai et al.
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Automated Annotation With Generative AI Requires Validation Nicholas Pangakis, Samuel Wolken, Neil Fasching
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Chatgpt MT: Competitive For High- (but Not Low-) Resource Languages Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
- Exploring The Potential Of Large Language Models To Generate Formative Programming Feedback Natalie Kiesler, Dominic Lohr, Hieke Keuning
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Clever Hans Or Neural Theory Of Mind? Stress Testing Social Reasoning In Large Language Models Natalie Shapira et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- A Review Of Chatgpt Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Memorybank: Enhancing Large Language Models With Long-term Memory Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Textbooks Are All You Need Suriya Gunasekar et al.
- Transformative Effects Of Chatgpt On Modern Education: Emerging Era Of AI Chatbots Sukhpal Singh Gill et al.
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- Multilingual Machine Translation With Large Language Models: Empirical Results And Analysis Wenhao Zhu et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- Is Chatgpt Equipped With Emotional Dialogue Capabilities? Weixiang Zhao et al.
- Layoutgpt: Compositional Visual Planning And Generation With Large Language Models Weixi Feng et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Bias Of Ai-generated Content: An Examination Of News Produced By Large Language Models Xiao Fang et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- MMMU: A Massive Multi-discipline Multimodal Understanding And Reasoning Benchmark For Expert AGI Xiang Yue et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Learning To Reason Over Scene Graphs: A Case Study Of Finetuning GPT-2 Into A Robot Language Model For Grounded Task Planning Georgia Chalvatzaki et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- Performance Of The Pre-trained Large Language Model GPT-4 On Automated Short Answer Grading Gerd Kortemeyer
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Perspectives On Large Language Models For Relevance Judgment Guglielmo Faggioli et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- REFINER: Reasoning Feedback On Intermediate Representations Debjit Paul et al.
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Have Llms Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models Daman Arora, Himanshu Gaurav Singh, Mausam
- Response: Emergent Analogical Reasoning In Large Language Models Damian Hodel, Jevin West
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Progressive-hint Prompting Improves Reasoning In Large Language Models Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Is Chatgpt A General-purpose Natural Language Processing Task Solver? Chengwei Qin et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- A Study Of Generative Large Language Model For Medical Research And Healthcare Cheng Peng et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Chatgpt Is A Remarkable Tool -- For Experts Amos Azaria, Rina Azoulay, Shulamit Reches
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- Poisoning Language Models During Instruction Tuning Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Phoenix: Democratizing Chatgpt Across Languages Zhihong Chen et al.
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Can Chatgpt Reproduce Human-generated Labels? A Study Of Social Computing Tasks Yiming Zhu, Peixian Zhang, Ehsan-ul Haq, Pan Hui, Gareth Tyson
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Jailbreaking Chatgpt Via Prompt Engineering: An Empirical Study Yi Liu et al.
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- RTLLM: An Open-source Benchmark For Design RTL Generation With Large Language Model Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Classeval: A Manually-crafted Benchmark For Evaluating Llms On Class-level Code Generation Xueying Du et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Performance Comparison Of Large Language Models On VNHSGE English Dataset: Openai Chatgpt, Microsoft Bing Chat, And Google Bard Xuan-quy Dao
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Hugginggpt: Solving AI Tasks With Chatgpt And Its Friends In Hugging Face Yongliang Shen et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge Yunxiang Li et al.
- MEDITRON-70B: Scaling Medical Pretraining For Large Language Models Zeming Chen et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- Learning Gain Differences Between Chatgpt And Human Tutor Generated Algebra Hints Zachary A. Pardos, Shreya Bhandari
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- In Chatgpt We Trust? Measuring And Characterizing The Reliability Of Chatgpt Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Recommender Systems In The Era Of Large Language Models (LLMs) Zihuai Zhao et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Earthgpt: A Universal Multi-modal Large Language Model For Multi-sensor Image Comprehension In Remote Sensing Domain Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- Large Legal Fictions: Profiling Legal Hallucinations In Large Language Models Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Clochat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-coder: When The Large Language Model Meets Programming -- The Rise Of Code Intelligence Daya Guo et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Financial Statement Analysis With Large Language Models Alex Kim, Maximilian Muhn, Valeri Nikolaev
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
🏷 Has Code
- Triviaqa: A Large Scale Distantly Supervised Challenge Dataset For Reading Comprehension Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer
- ParlAI: A Dialog Research Software Platform Alexander H. Miller et al.
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Passage Re-ranking With BERT Rodrigo Nogueira, Kyunghyun Cho
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Language Models As Knowledge Bases? Fabio Petroni et al.
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Multimodal Transformer Networks For End-to-end Video-grounded Dialogue Systems Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Repurposing Entailment For Multi-hop Question Answering Tasks Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- How Can We Know What Language Models Know? Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- FreeLB: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- NeMo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- LAMOL: Language Modeling For Lifelong Language Learning Fan-keng Sun, Cheng-hao Ho, Hung-yi Lee
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- DeFormer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- DeLighT: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- Detecting Hallucinated Content In Conditional Neural Sequence Generation Chunting Zhou et al.
- EarlyBERT: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- DeeBERT: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- LayoutLMv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Funnel-Transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Unnatural Language Inference Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- How Much Knowledge Can You Pack Into The Parameters Of A Language Model? Adam Roberts, Colin Raffel, Noam Shazeer
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Logic2Text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- LightSeq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- CharBERT: Character-aware Pre-trained Language Model Wentao Ma et al.
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- LightningDOT: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- LightNER: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- KnowPrompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction Xiang Chen et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- VLMo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Grounded Language-image Pre-training Liunian Harold Li et al.
- Controllable Generation From Pre-trained Language Models Via Inverse Prompting Xu Zou et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Efficient Passage Retrieval With Hashing For Open-domain Question Answering Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi
- DeltaLM: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- SwinBERT: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- AraT5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- LoRA: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-LM Deepak Narayanan et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Multitask Prompted Training Enables Zero-shot Task Generalization Victor Sanh et al.
- TeraPipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- BARTScore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- VL-Adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Summ^n: A Multi-stage Summarization Framework For Long Input Dialogues And Documents Yusen Zhang et al.
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- OpenPrompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- DenseCLIP: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- CLIP-Adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- DeBERTaV3: Improving DeBERTa Using ELECTRA-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- XTREME-R: Towards More Challenging And Nuanced Multilingual Evaluation Sebastian Ruder et al.
- VisualGPT: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- CoMPM: Context Modeling With Speaker's Pre-trained Memory Tracking For Emotion Recognition In Conversation Joosung Lee, Wooin Lee
- Towards Continual Knowledge Learning Of Language Models Joel Jang et al.
- e-ViL: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- DialogLM: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Tip-Adapter: Training-free CLIP-Adapter For Better Vision-language Modeling Renrui Zhang et al.
- Sentence-T5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- FastMoE: A Fast Mixture-of-expert Training System Jiaao He et al.
- HiddenCut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- Generated Knowledge Prompting For Commonsense Reasoning Jiacheng Liu et al.
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- Few-shot Conversational Dense Retrieval Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, Zhiyuan Liu
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- NormFormer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- DALL-Eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- ReAct: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- RetroMAE: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- GPT-NeoX-20B: An Open-source Autoregressive Language Model Sid Black et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Recitation-augmented Language Models Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- PromptSource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- SmoothQuant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- AltCLIP: Altering The Language Encoder In CLIP For Extended Language Capabilities Zhongzhi Chen et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Language Models Are Multilingual Chain-of-thought Reasoners Freda Shi et al.
- CodeGen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Memory-based Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
- VL-CheckList: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- ToxiGen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- UnifiedSKG: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- When To Make Exceptions: Exploring Language Models As Accounts Of Human Moral Judgment Zhijing Jin et al.
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- SpeechPrompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- DiffuSeq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Unified-IO: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- DualPrompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- CaMEL: Mean Teacher Learning For Image Captioning Manuele Barraco et al.
- PAL: Program-aided Language Models Luyu Gao et al.
- InPars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Re2G: Retrieve, Rerank, Generate Michael Glass et al.
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- BioGPT: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Towards A Unified Multi-dimensional Evaluator For Text Generation Ming Zhong et al.
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- LayoutLMv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- LM-Nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- The Stack: 3 TB Of Permissively Licensed Source Code Denis Kocetkov et al.
- ConvFinQA: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- FactPEGASUS: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Large Language Models Meet NL2Code: A Survey Daoguang Zan et al.
- InCoder: A Generative Model For Code Infilling And Synthesis Daniel Fried et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- DS-1000: A Natural And Reliable Benchmark For Data Science Code Generation Yuhang Lai et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Clinical-Longformer And Clinical-BigBird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-manchego, Jose Camacho-collados
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- GrIPS: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Plug-and-play VQA: Zero-shot VQA By Conjoining Large Pretrained Models With Zero Training Anthony Meng Huat Tiong, Junnan Li, Boyang Li, Silvio Savarese, Steven C. H. Hoi
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Scaling Up Models And Data With \(\texttt{t5x}\) And \(\texttt{seqio}\) Adam Roberts et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- No Language Left Behind: Scaling Human-centered Machine Translation Nllb Team et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- MaPLe: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Generate Rather Than Retrieve: Large Language Models Are Strong Context Generators Wenhao Yu et al.
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- Do LLMs Understand Social Knowledge? Evaluating The Sociability Of Large Language Models With SocKET Benchmark Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Med-Flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- ChatGPT For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- H\(_2\)O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked Mansi Phute et al.
- The Reversal Curse: LLMs Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- LLM-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Judging LLM-as-a-Judge With MT-Bench And Chatbot Arena Lianmin Zheng et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- LayoutLLM-T2I: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- MVBench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- SentimentGPT: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- TALLRec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- GeoChat: Grounded Large Vision-language Model For Remote Sensing Kartik Kuckreja et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Honeybee: Locality-enhanced Projector For Multimodal LLM Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Compositional Exemplars For In-context Learning Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Simple And Controllable Music Generation Jade Copet et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Ferret: Refer And Ground Anything Anywhere At Any Granularity Haoxuan You et al.
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Lmdrive: Closed-loop End-to-end Driving With Large Language Models Hao Shao et al.
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Mplug-2: A Modularized Multi-modal Foundation Model Across Text, Image And Video Haiyang Xu et al.
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Codegeex: A Pre-trained Model For Code Generation With Multilingual Benchmarking On Humaneval-x Qinkai Zheng et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Can Ai-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Generating With Confidence: Uncertainty Quantification For Black-box Large Language Models Zhen Lin, Shubhendu Trivedi, Jimeng Sun
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Codekgc: Code Language Model For Generative Knowledge Graph Construction Zhen Bi et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- Multilingual Machine Translation With Large Language Models: Empirical Results And Analysis Wenhao Zhu et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Internvl: Scaling Up Vision Foundation Models And Aligning For Generic Visual-linguistic Tasks Zhe Chen et al.
- Active Retrieval Augmented Generation Zhengbao Jiang et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- Show-1: Marrying Pixel And Latent Diffusion Models For Text-to-video Generation David Junhao Zhang et al.
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- Chatdev: Communicative Agents For Software Development Chen Qian et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Phoenix: Democratizing Chatgpt Across Languages Zhihong Chen et al.
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Element-aware Summarization With Large Language Models: Expert-aligned Evaluation And Chain-of-thought Method Yiming Wang, Zhuosheng Zhang, Rui Wang
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- Evaluating Object Hallucination In Large Vision-language Models Yifan Li et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Llama-vid: An Image Is Worth 2 Tokens In Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Specinfer: Accelerating Generative Large Language Model Serving With Tree-based Speculative Inference And Verification Xupeng Miao et al.
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Classeval: A Manually-crafted Benchmark For Evaluating Llms On Class-level Code Generation Xueying Du et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Contextual Object Detection With Multimodal Large Language Models Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Editing Large Language Models: Problems, Methods, And Opportunities Yunzhi Yao et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- The Effect Of Sampling Temperature On Problem Solving In Large Language Models Matthew Renze, Erhan Guven
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
🏷 ICLR
🏷 ICML
🏷 In-Context Learning
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Glam: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Learning To Retrieve Prompts For In-context Learning Ohad Rubin, Jonathan Herzig, Jonathan Berant
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Teaching Algorithmic Reasoning Via In-context Learning Hattie Zhou et al.
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- In-context Examples Selection For Machine Translation Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Rethinking The Role Of Demonstrations: What Makes In-context Learning Work? Sewon Min et al.
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- PAL: Program-aided Language Models Luyu Gao et al.
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Large Language Models Are Human-level Prompt Engineers Yongchao Zhou et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Voicebox: Text-guided Multilingual Universal Speech Generation At Scale Matthew Le et al.
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Compositional Exemplars For In-context Learning Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Symbol Tuning Improves In-context Learning In Language Models Jerry Wei et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction Mohammadreza Pourreza, Davood Rafiei
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
🏷 Interpretability and Explainability
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Abductive Commonsense Reasoning Chandra Bhagavatula et al.
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- On Hallucination And Predictive Uncertainty In Conditional Language Generation Yijun Xiao, William Yang Wang
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations Jaehun Jung et al.
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- On The Paradox Of Learning To Reason From Data Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-wei Chang, Guy Van Den Broeck
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Automatic Generation Of Programming Exercises And Code Explanations Using Large Language Models Sami Sarsa, Paul Denny, Arto Hellas, Juho Leinonen
- Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Paperqa: Retrieval-augmented Generative Agent For Scientific Research Jakub Lála et al.
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Better Patching Using LLM Prompting, Via Self-consistency Toufique Ahmed, Premkumar Devanbu
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
🏷 INTERSPEECH
- Advancing The State Of The Art In Open Domain Dialog Systems Through The Alexa Prize Chandra Khatri et al.
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Revisiting End-to-end Speech-to-text Translation From Scratch Biao Zhang, Barry Haddow, Rico Sennrich
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Prompting Large Language Models With Speech Recognition Abilities Yassir Fathullah et al.
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
🏷 KDD
🏷 Language Modeling
- Neural Text Generation From Structured Data With Application To The Biography Domain Remi Lebret, David Grangier, Michael Auli
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- Neural Text Generation: A Practical Guide Ziang Xie
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Evaluating Text Gans As Language Models Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant
- Seq2seq-vis: A Visual Debugging Tool For Sequence-to-sequence Models Hendrik Strobelt et al.
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Learn To Code-switch: Data Augmentation Using Copy Mechanism On Language Modeling Genta Indra Winata, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Pervasive Attention: 2D Convolutional Neural Networks For Sequence-to-sequence Prediction Maha Elbayad, Laurent Besacier, Jakob Verbeek
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Passage Re-ranking With BERT Rodrigo Nogueira, Kyunghyun Cho
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- ELI5: Long Form Question Answering Angela Fan et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- Barack's Wife Hillary: Using Knowledge-graphs For Fact-aware Language Modeling Robert L. Logan IV, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Counterfactual Story Reasoning And Generation Lianhui Qin et al.
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Caire: An Empathetic Neural Chatbot Zhaojiang Lin et al.
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- Sentence-level Content Planning And Style Specification For Neural Text Generation Xinyu Hua, Lu Wang
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Commongen: A Constrained Text Generation Challenge For Generative Commonsense Reasoning Bill Yuchen Lin et al.
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- What Makes A Good Conversation? How Controllable Attributes Affect Human Judgments Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
- LAMOL: Language Modeling For Lifelong Language Learning Fan-keng Sun, Cheng-hao Ho, Hung-yi Lee
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Moverscore: Text Generation Evaluating With Contextualized Embeddings And Earth Mover Distance Wei Zhao et al.
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- Unifiedqa: Crossing Format Boundaries With A Single QA System Daniel Khashabi et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- Non-autoregressive Machine Translation With Latent Alignments Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Mathematical Reasoning Via Self-supervised Skip-tree Training Markus N. Rabe, Dennis Lee, Kshitij Bansal, Christian Szegedy
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Exploring And Predicting Transferability Across NLP Tasks Tu Vu et al.
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-scale Language Models Peng Xu et al.
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Pre-training Via Paraphrasing Mike Lewis et al.
- Investigating Pretrained Language Models For Graph-to-text Generation Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, Iryna Gurevych
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Genaug: Data Augmentation For Finetuning Text Generators Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- Generation-augmented Retrieval For Open-domain Question Answering Yuning Mao et al.
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Language Generation With Multi-hop Reasoning On Commonsense Knowledge Graph Haozhe Ji et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Knowprompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction Xiang Chen et al.
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- Thinking Aloud: Dynamic Context Generation Improves Zero-shot Reasoning Performance Of GPT-2 Gregor Betz, Kyle Richardson, Christian Voigt
- Parallel Refinements For Lexically Constrained Text Generation With BART Xingwei He
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- Controllable Generation From Pre-trained Language Models Via Inverse Prompting Xu Zou et al.
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- MAUVE: Measuring The Gap Between Neural Text And Human Text Using Divergence Frontiers Krishna Pillutla et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Structural Adapters In Pretrained Language Models For Amr-to-text Generation Leonardo F. R. Ribeiro, Yue Zhang, Iryna Gurevych
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Focused Attention Improves Document-grounded Generation Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Adaptive Semiparametric Language Models Dani Yogatama, Cyprien De Masson D'autume, Lingpeng Kong
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Bartscore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Hindsight: Posterior-guided Training Of Retrievers For Improved Open-ended Generation Ashwin Paranjape, Omar Khattab, Christopher Potts, Matei Zaharia, Christopher D. Manning
- On Hallucination And Predictive Uncertainty In Conditional Language Generation Yijun Xiao, William Yang Wang
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- Dexperts: Decoding-time Controlled Text Generation With Experts And Anti-experts Alisa Liu et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Long Text Generation By Modeling Sentence-level And Discourse-level Coherence Jian Guan et al.
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Diffuseq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Zerogen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Confident Adaptive Language Modeling Tal Schuster et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- Vit5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Biogpt: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- Towards A Unified Multi-dimensional Evaluator For Text Generation Ming Zhong et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Calibrating Sequence Likelihood Improves Conditional Language Generation Yao Zhao et al.
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Large Language Models Cannot Self-correct Reasoning Yet Jie Huang et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Do Generative Large Language Models Need Billions Of Parameters? Sia Gholami, Marwan Omar
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Chain-of-verification Reduces Hallucination In Large Language Models Shehzaad Dhuliawala et al.
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- How Useful Are Educational Questions Generated By Large Language Models? Sabina Elkins, Ekaterina Kochmar, Jackie C. K. Cheung, Iulian Serban
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Datatales: Investigating The Use Of Large Language Models For Authoring Data-driven Articles Nicole Sultanum, Arjun Srinivasan
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Macaw-llm: Multi-modal Language Modeling With Image, Audio, Video, And Text Integration Chenyang Lyu et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Lamp: When Large Language Models Meet Personalization Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- AI Hallucinations: A Misnomer Worth Clarifying Negar Maleki, Balaji Padmanabhan, Kaushik Dutta
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
🏷 Large-Scale Training
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- AI And Memory Wall Amir Gholami et al.
🏷 LREC
🏷 Masked Language Model
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Span Selection Pre-training For Question Answering Michael Glass et al.
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Contextualized Perturbation For Textual Adversarial Attack Dianqi Li et al.
- Autoprompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Taylor Shin, Yasaman Razeghi, Robert L. Iv Logan, Eric Wallace, Sameer Singh
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- On Learning Universal Representations Across Languages Xiangpeng Wei et al.
- Pre-training Via Paraphrasing Mike Lewis et al.
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- Knowprompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction Xiang Chen et al.
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
🏷 Merging
- Attention Is All You Need Ashish Vaswani et al.
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Hierarchical Neural Story Generation Angela Fan, Mike Lewis, Yann Dauphin
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Jointly Optimizing Diversity And Relevance In Neural Response Generation Xiang Gao et al.
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Knowledge Aware Conversation Generation With Explainable Reasoning Over Augmented Graphs Zhibin Liu, Zheng-yu Niu, Hua Wu, Haifeng Wang
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- Dialoguetrm: Exploring The Intra- And Inter-modal Emotional Behaviors In The Conversation Yuzhao Mao et al.
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- Thinking Fast And Slow In Large Language Models Thilo Hagendorff, Sarah Fabi, Michal Kosinski
- Promptsource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- SKILL: Structured Knowledge Infusion For Large Language Models Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Diffuseq: Sequence To Sequence Text Generation With Diffusion Models Shansan Gong, Mukai Li, Jiangtao Feng, Zhiyong Wu, Lingpeng Kong
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Conversational Question Answering On Heterogeneous Sources Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Chatgpt Is Not All You Need. A State Of The Art Review Of Large Generative AI Models Roberto Gozalo-brizuela, Eduardo C. Garrido-merchan
- Beyond Memorization: Violating Privacy Via Inference With Large Language Models Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- The Troubling Emergence Of Hallucination In Large Language Models -- An Extensive Definition, Quantification, And Prescriptive Remediations Vipula Rawte et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Transformative Effects Of Chatgpt On Modern Education: Emerging Era Of AI Chatbots Sukhpal Singh Gill et al.
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- Show-1: Marrying Pixel And Latent Diffusion Models For Text-to-video Generation David Junhao Zhang et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- Beyond Chain-of-thought, Effective Graph-of-thought Reasoning In Language Models Yao Yao, Zuchao Li, Hai Zhao
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Recent Advances In Generative AI And Large Language Models: Current Status, Challenges, And Perspectives Desta Haileselassie Hagos, Rick Battle, Danda B. Rawat
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
🏷 Model Architecture
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- Separating Answers From Queries For Neural Reading Comprehension Dirk Weissenborn
- Programming With A Differentiable Forth Interpreter Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel
- Learning Python Code Suggestion With A Sparse Pointer Network Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Attention Is All You Need Ashish Vaswani et al.
- Batch Policy Gradient Methods For Improving Neural Conversation Models Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- A Deep Reinforcement Learning Chatbot Iulian V. Serban et al.
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- Wizard Of Wikipedia: Knowledge-powered Conversational Agents Emily Dinan et al.
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- A Dataset For Document Grounded Conversations Kangyan Zhou, Shrimai Prabhumoye, Alan W Black
- Sequence-to-sequence Learning For Task-oriented Dialogue With Dialogue State Representation Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin, Ting Liu
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Seq2rdf: An End-to-end Application For Deriving Triples From Natural Language Text Yue Liu, Tongtao Zhang, Zhicheng Liang, Heng Ji, Deborah L. Mcguinness
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Learn To Code-switch: Data Augmentation Using Copy Mechanism On Language Modeling Genta Indra Winata, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Commonsense For Generative Multi-hop Question Answering Tasks Lisa Bauer, Yicheng Wang, Mohit Bansal
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- Pervasive Attention: 2D Convolutional Neural Networks For Sequence-to-sequence Prediction Maha Elbayad, Laurent Besacier, Jakob Verbeek
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- Adversarially Regularising Neural NLI Models To Integrate Logical Background Knowledge Pasquale Minervini, Sebastian Riedel
- "Bilingual Expert" Can Find Translation Errors Kai Fan et al.
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Hierarchical Neural Story Generation Angela Fan, Mike Lewis, Yann Dauphin
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Myers-briggs Personality Classification And Personality-specific Language Generation Using Pre-trained Language Models Sedrick Scott Keh, I-tsun Cheng
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Passage Re-ranking With BERT Rodrigo Nogueira, Kyunghyun Cho
- Roberta: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- Reqa: An Evaluation For End-to-end Answer Retrieval Models Amin Ahmad, Noah Constant, Yinfei Yang, Daniel Cer
- Visualbert: A Simple And Performant Baseline For Vision And Language Liunian Harold Li, Mark Yatskar, Da Yin, Cho-jui Hsieh, Kai-wei Chang
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- A Survey Of Natural Language Generation Techniques With A Focus On Dialogue Systems - Past, Present And Future Directions Sashank Santhanam, Samira Shaikh
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Efficient Adaptation Of Pretrained Transformers For Abstractive Summarization Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Frustratingly Easy Natural Question Answering Lin Pan et al.
- Multi-step Retriever-reader Interaction For Scalable Open-domain Question Answering Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew Mccallum
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Pretrained Language Models For Sequential Sentence Classification Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
- Incremental Transformer With Deliberation Decoder For Document Grounded Conversations Zekang Li et al.
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Linking Artificial And Human Neural Representations Of Language Jon Gauthier, Roger Levy
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Language Models As Knowledge Bases? Fabio Petroni et al.
- Camembert: A Tasty French Language Model Louis Martin et al.
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Pythia: Ai-assisted Code Completion System Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Context-aware Learning For Neural Machine Translation Sébastien Jean, Kyunghyun Cho
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- BERT Has A Mouth, And It Must Speak: BERT As A Markov Random Field Language Model Alex Wang, Kyunghyun Cho
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Multimodal Transformer Networks For End-to-end Video-grounded Dialogue Systems Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Is Multilingual BERT Fluent In Language Generation? Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- The Second Conversational Intelligence Challenge (convai2) Emily Dinan et al.
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Repurposing Entailment For Multi-hop Question Answering Tasks Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
- Exploiting Persona Information For Diverse Generation Of Conversational Responses Haoyu Song, Wei-nan Zhang, Yiming Cui, Dong Wang, Ting Liu
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Deep Learning Based Chatbot Models Richard Csaky
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Self-attentive Model For Headline Generation Daniil Gavrilov, Pavel Kalaidin, Valentin Malykh
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Gpt-based Generation For Classical Chinese Poetry Yi Liao, Yasheng Wang, Qun Liu, Xin Jiang
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- The Evolved Transformer David R. So, Chen Liang, Quoc V. Le
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten De Rijke
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Modeling Recurrence For Transformer Jie Hao et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Berts Of A Feather Do Not Generalize Together: Large Variability In Generalization Across Models With Similar Test Set Performance R. Thomas Mccoy, Junghyun Min, Tal Linzen
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Do Massively Pretrained Language Models Make Better Storytellers? Abigail See, Aneesh Pappu, Rohun Saxena, Akhila Yerukola, Christopher D. Manning
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Compressive Transformers For Long-range Sequence Modelling Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Do Neural Dialog Systems Use The Conversation History Effectively? An Empirical Study Chinnadhurai Sankar, Sandeep Subramanian, Christopher Pal, Sarath Chandar, Yoshua Bengio
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Boolq: Exploring The Surprising Difficulty Of Natural Yes/no Questions Christopher Clark et al.
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- ACUTE-EVAL: Improved Dialogue Evaluation With Optimized Questions And Multi-turn Comparisons Margaret Li, Jason Weston, Stephen Roller
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Open-retrieval Conversational Question Answering Chen Qu et al.
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- Modifying Memories In Transformer Models Chen Zhu et al.
- The Radicalization Risks Of GPT-3 And Advanced Neural Language Models Kris Mcguffie, Alex Newhouse
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- Talking-heads Attention Noam Shazeer, Zhenzhong Lan, Youlong Cheng, Nan Ding, Le Hou
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Unsupervised Paraphrase Generation Using Pre-trained Language Models Chaitra Hegde, Shrikumar Patil
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Sequential Latent Knowledge Selection For Knowledge-grounded Dialogue Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- Progressive Generation Of Long Text With Pretrained Language Models Bowen Tan, Zichao Yang, Maruan Al-shedivat, Eric P. Xing, Zhiting Hu
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- Pretrained Transformers For Simple Question Answering Over Knowledge Graphs D. Lukovnikov, A. Fischer, J. Lehmann
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Rikinet: Reading Wikipedia Pages For Natural Question Answering Dayiheng Liu et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models Sheng-chieh Lin et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- GRUEN For Evaluating Linguistic Quality Of Generated Text Wanzheng Zhu, Suma Bhat
- Artificial Intelligence Versus Maya Angelou: Experimental Evidence That People Cannot Differentiate Ai-generated From Human-written Poetry Nils Köbis, Luca Mossink
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- Russiansuperglue: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- Non-autoregressive Machine Translation With Latent Alignments Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Hard-coded Gaussian Attention For Neural Machine Translation Weiqiu You, Simeng Sun, Mohit Iyyer
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Dialoguetrm: Exploring The Intra- And Inter-modal Emotional Behaviors In The Conversation Yuzhao Mao et al.
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Mapping Natural Language Instructions To Mobile UI Action Sequences Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- Speaker-aware BERT For Multi-turn Response Selection In Retrieval-based Chatbots Jia-chen Gu et al.
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- Bert-hlstms: BERT And Hierarchical Lstms For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- PALM: Pre-training An Autoencoding & Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies For Multi-turn Response Selection Taesun Whang et al.
- Rethinking The Value Of Transformer Components Wenxuan Wang, Zhaopeng Tu
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Query Resolution For Conversational Search With Limited Supervision Nikos Voskarides, Dan Li, Pengjie Ren, Evangelos Kanoulas, Maarten De Rijke
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Colake: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Code Prediction By Feeding Trees To Transformers Seohyun Kim, Jinman Zhao, Yuchi Tian, Satish Chandra
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Unnatural Language Inference Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Contrastive Code Representation Learning Paras Jain et al.
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Are We Pretraining It Right? Digging Deeper Into Visio-linguistic Pretraining Amanpreet Singh, Vedanuj Goswami, Devi Parikh
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- Few-shot Natural Language Generation For Task-oriented Dialog Baolin Peng et al.
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Beyond I.I.D.: Three Levels Of Generalization For Question Answering On Knowledge Bases Yu Gu et al.
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- DUMA: Reading Comprehension With Transposition Thinking Pengfei Zhu, Hai Zhao, Xiaoguang Li
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- On Learning Universal Representations Across Languages Xiangpeng Wei et al.
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Rapidly Bootstrapping A Question Answering Dataset For COVID-19 Raphael Tang et al.
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- Minilmv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- IART: Intent-aware Response Ranking With Transformers In Information-seeking Conversation Systems Liu Yang et al.
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Lightseq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Calibration Of Pre-trained Transformers Shrey Desai, Greg Durrett
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- How Context Affects Language Models' Factual Predictions Fabio Petroni et al.
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- Training Question Answering Models From Synthetic Data Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
- Emptransfo: A Multi-head Transformer Architecture For Creating Empathetic Dialog Systems Rohola Zandie, Mohammad H. Mahoor
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Narrative Interpolation For Generating And Understanding Stories Su Wang, Greg Durrett, Katrin Erk
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Plotmachines: Outline-conditioned Generation With Dynamic Plot State Tracking Hannah Rashkin, Asli Celikyilmaz, Yejin Choi, Jianfeng Gao
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Genaug: Data Augmentation For Finetuning Text Generators Steven Y. Feng, Varun Gangal, Dongyeop Kang, Teruko Mitamura, Eduard Hovy
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- Imitation Attacks And Defenses For Black-box Machine Translation Systems Eric Wallace, Mitchell Stern, Dawn Song
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-lan Boureau
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- Controlling Style In Generated Dialogue Eric Michael Smith, Diana Gonzalez-rico, Emily Dinan, Y-lan Boureau
- A Closer Look At The Robustness Of Vision-and-language Pre-trained Models Linjie Li, Zhe Gan, Jingjing Liu
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Indonlg: Benchmark And Resources For Evaluating Indonesian Natural Language Generation Samuel Cahyawijaya et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Language Model As An Annotator: Exploring Dialogpt For Dialogue Summarization Xiachong Feng, Xiaocheng Feng, Libo Qin, Bing Qin, Ting Liu
- Entailment As Few-shot Learner Sinong Wang, Han Fang, Madian Khabsa, Hanzi Mao, Hao Ma
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- A Token-level Reference-free Hallucination Detection Benchmark For Free-form Text Generation Tianyu Liu et al.
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Thinking Aloud: Dynamic Context Generation Improves Zero-shot Reasoning Performance Of GPT-2 Gregor Betz, Kyle Richardson, Christian Voigt
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Language Models Are Few-shot Multilingual Learners Genta Indra Winata et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- Byt5: Towards A Token-free Future With Pre-trained Byte-to-byte Models Linting Xue et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Robeczech: Czech Roberta, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Improving Stack Overflow Question Title Generation With Copying Enhanced Codebert Model And Bi-modal Information Fengji Zhang et al.
- Progressive Transformer-based Generation Of Radiology Reports Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Exploring Transformers In Natural Language Generation: GPT, BERT, And Xlnet M. Onat Topal, Anil Bas, Imke Van Heerden
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Cotext: Multi-task Learning With Code-text Transformer Long Phan et al.
- Transformer-based Conditional Variational Autoencoder For Controllable Story Generation Le Fang et al.
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Retrieval Augmentation Reduces Hallucination In Conversation Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- Codexglue: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- Focused Attention Improves Document-grounded Generation Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Finetuned Language Models Are Zero-shot Learners Jason Wei et al.
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-wei Chang, Nanyun Peng
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- All That's 'human' Is Not Gold: Evaluating Human Evaluation Of Generated Text Elizabeth Clark et al.
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Sequence Length Is A Domain: Length-based Overfitting In Transformer Models Dušan Variš, Ondřej Bojar
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- A Short Survey Of Pre-trained Language Models For Conversational AI-A Newage In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- GLaM: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Emotion-aware Chat Machine: Automatic Emotional Response Generation For Human-like Emotional Interaction Wei Wei et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Adaptive Semiparametric Language Models Dani Yogatama, Cyprien De Masson D'autume, Lingpeng Kong
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Language Model Evaluation Beyond Perplexity Clara Meister, Ryan Cotterell
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- Fastformer: Additive Attention Can Be All You Need Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- NewsBERT: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- ClimateBert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- CodeT5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- TeraPipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Human Parity On Commonsenseqa: Augmenting Self-attention With External Attention Yichong Xu et al.
- Hierarchical Task Learning From Language Instructions With Unified Transformers And Self-monitoring Yichi Zhang, Joyce Chai
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- URLTran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- MarIA: Spanish Language Models Asier Gutiérrez-fandiño et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- Predicting The Performance Of Multilingual NLP Models Anirudh Srinivasan et al.
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- Scalable And Efficient MoE Training For Multitask Multilingual Models Young Jin Kim et al.
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- TaCL: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- DExperts: Decoding-time Controlled Text Generation With Experts And Anti-experts Alisa Liu et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Understanding The Capabilities, Limitations, And Societal Impact Of Large Language Models Alex Tamkin, Miles Brundage, Jack Clark, Deep Ganguli
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- DeBERTaV3: Improving DeBERTa Using ELECTRA-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- BERTese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Distilling Large Language Models Into Tiny And Effective Students Using pQRNN Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- RoBERTuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Learned Token Pruning For Transformers Sehoon Kim et al.
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- VisualGPT: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Dialogue History Matters! Personalized Response Selection In Multi-turn Retrieval-based Chatbots Juntao Li et al.
- GPT Understands, Too Xiao Liu et al.
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- Rome Was Built In 1776: A Case Study On Factual Correctness In Knowledge-grounded Response Generation Sashank Santhanam et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- Beyond Goldfish Memory: Long-term Open-domain Conversation Jing Xu, Arthur Szlam, Jason Weston
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- Show Your Work: Scratchpads For Intermediate Computation With Language Models Maxwell Nye et al.
- Sentence-T5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- FastMoE: A Fast Mixture-of-expert Training System Jiaao He et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- WebGPT: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- A Comparative Study Of Transformer-based Language Models On Extractive Question Answering Kate Pearce, Tiffany Zhan, Aneesh Komanduri, Justin Zhan
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- GPT3Mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- NormFormer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- ReACC: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- DALL-Eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- CodeRL: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- RetroMAE: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Chain-of-thought Prompting Elicits Reasoning In Large Language Models Jason Wei et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- On The Paradox Of Learning To Reason From Data Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-wei Chang, Guy Van Den Broeck
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- GPT-NeoX-20B: An Open-source Autoregressive Language Model Sid Black et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Thinking Fast And Slow In Large Language Models Thilo Hagendorff, Sarah Fabi, Michal Kosinski
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- Uni-perceiver V2: A Generalist Model For Large-scale Vision And Vision-language Tasks Hao Li et al.
- VL-BEiT: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Large Language Models And The Reverse Turing Test Terrence Sejnowski
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- LUT-GEMM: Quantized Matrix Multiplication Based On LUTs For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- DyLoRA: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- HiTSKT: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Using Large Language Models To Simulate Multiple Humans And Replicate Human Subject Studies Gati Aher, Rosa I. Arriaga, Adam Tauman Kalai
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- A Systematic Evaluation Of Large Language Models Of Code Frank F. Xu, Uri Alon, Graham Neubig, Vincent J. Hellendoorn
- PanGu-Coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- SgVA-CLIP: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- M6-Rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Galactica: A Large Language Model For Science Ross Taylor et al.
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- FlashAttention: Fast And Memory-efficient Exact Attention With IO-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Language Models That Seek For Knowledge: Modular Search & Generation For Dialogue And Prompt Completion Kurt Shuster et al.
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Blenderbot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- AlexaTM 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- ChatGPT Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Action-GPT: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- DeepSpeed-MoE: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Do Large Language Models Know What Humans Know? Sean Trott, Cameron Jones, Tyler Chang, James Michaelov, Benjamin Bergen
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Rethinking The Role Of Demonstrations: What Makes In-context Learning Work? Sewon Min et al.
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- ChatGPT: The End Of Online Exam Integrity? Teo Susnjak
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Unified-IO: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- GTrans: Grouping And Fusing Transformer Layers For Neural Machine Translation Jian Yang et al.
- Using DeepSpeed And Megatron To Train Megatron-Turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- CogVideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- ZeroGen: Efficient Zero-shot Learning Via Dataset Generation Jiacheng Ye et al.
- Teaching Models To Express Their Uncertainty In Words Stephanie Lin, Jacob Hilton, Owain Evans
- CoCa: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- PaLI: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Confident Adaptive Language Modeling Tal Schuster et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- GPT-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- CaMEL: Mean Teacher Learning For Image Captioning Manuele Barraco et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- ViT5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- InPars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- LaMDA: Language Models For Dialog Applications Romal Thoppilan et al.
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Re2G: Retrieve, Rerank, Generate Michael Glass et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Visual Prompt Tuning Menglin Jia et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- GPT Takes The Bar Exam Michael Ii Bommarito, Daniel Martin Katz
- ZeroQuant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- BioGPT: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- CoAuthor: Designing A Human-ai Collaborative Writing Dataset For Exploring Language Model Capabilities Mina Lee, Percy Liang, Qian Yang
- Evaluating Human-language Model Interaction Mina Lee et al.
- RLPrompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-jui Fu et al.
- MuRAG: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- MixGen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- LLM.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- HyperPrompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- LayoutLMv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- LM-Nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Improving Passage Retrieval With Zero-shot Question Generation Devendra Singh Sachan et al.
- AdaPrompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- ByteTransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Competition-level Code Generation With AlphaCode Yujia Li et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Putting GPT-3's Creativity To The (alternative Uses) Test Claire Stevenson, Iris Smal, Matthijs Baas, Raoul Grasman, Han Van Der Maas
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- EVA2.0: Investigating Open-domain Chinese Dialogue Systems With Large-scale Pre-training Yuxian Gu et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- AdaMix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Iv Logan, Matt Gardner, Sameer Singh
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Is GPT-3 A Good Data Annotator? Bosheng Ding et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Revisiting End-to-end Speech-to-text Translation From Scratch Biao Zhang, Barry Haddow, Rico Sennrich
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- Attributed Question Answering: Evaluation And Modeling For Attributed Large Language Models Bernd Bohnet et al.
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- A Survey Of Vision-language Pre-trained Models Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Clinical-longformer And Clinical-bigbird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-collados
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Super-naturalinstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Ernie-search: Bridging Cross-encoder With Dual-encoder Via Self On-the-fly Distillation For Dense Passage Retrieval Yuxiang Lu et al.
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Language Models Of Code Are Few-shot Commonsense Learners Aman Madaan, Shuyan Zhou, Uri Alon, Yiming Yang, Graham Neubig
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Scaling Up Models And Data With T5x And Seqio Adam Roberts et al.
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Emergent Analogical Reasoning In Large Language Models Taylor Webb, Keith J. Holyoak, Hongjing Lu
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Make-a-scene: Scene-based Text-to-image Generation With Human Priors Oran Gafni et al.
- Educational Question Generation Of Children Storybooks Via Question Type Distribution Learning And Event-centric Summarization Zhenjie Zhao et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- "This Is My Unicorn, Fluffy": Personalizing Frozen Vision-language Representations Niv Cohen, Rinon Gal, Eli A. Meirom, Gal Chechik, Yuval Atzmon
- Towards Unified Conversational Recommender Systems Via Knowledge-enhanced Prompt Learning Xiaolei Wang, Kun Zhou, Ji-rong Wen, Wayne Xin Zhao
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Arabart: A Pretrained Arabic Sequence-to-sequence Model For Abstractive Summarization Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Lamini-lm: A Diverse Herd Of Distilled Models From Large-scale Instructions Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-mageed, Alham Fikri Aji
- Evaluating Large Language Models In Theory Of Mind Tasks Michal Kosinski
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- H2O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- CTRAN: Cnn-transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- Label Supervised Llama Finetuning Zongxi Li et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Co-writing With Opinionated Language Models Affects Users' Views Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
- Voicebox: Text-guided Multilingual Universal Speech Generation At Scale Matthew Le et al.
- Large Language Models Effectively Leverage Document-level Context For Literary Translation, But Critical Errors Persist Marzena Karpinska, Mohit Iyyer
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Comparing Sentence-level Suggestions To Message-level Suggestions In Ai-mediated Communication Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Human-ai Collaboration In Thematic Analysis Using Chatgpt: A User Study And Design Recommendations Lixiang Yan et al.
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- Give Us The Facts: Enhancing Large Language Models With Knowledge Graphs For Fact-aware Language Modeling Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, Xindong Wu
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- Deep Learning Mental Health Dialogue System Lennart Brocki, George C. Dyer, Anna Gładka, Neo Christopher Chung
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- 14 Examples Of How Llms Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- News Verifiers Showdown: A Comparative Performance Evaluation Of Chatgpt 3.5, Chatgpt 4.0, Bing AI, And Bard In News Fact-checking Kevin Matthe Caramancion
- Inference-time Intervention: Eliciting Truthful Answers From A Language Model Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- The Imitation Game: Detecting Human And Ai-generated Texts In The Era Of Chatgpt And BARD Kadhim Hayawi, Sakib Shahriar, Sujith Samuel Mathew
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Writer-defined AI Personas For On-demand Feedback Generation Karim Benharrak, Tim Zindulka, Florian Lehmann, Hendrik Heuer, Daniel Buschek
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Chatcounselor: A Large Language Model For Mental Health Support June M. Liu et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- Spear Phishing With Large Language Models Julian Hazell
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Longnet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- On Decoder-only Architecture For Speech-to-text And Large Language Model Integration Jian Wu et al.
- The Impact Of Chatgpt And Llms On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Artificial Muses: Generative Artificial Intelligence Chatbots Have Risen To Human-level Creativity Jennifer Haase, Paul H. P. Hanel
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Graphix-t5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-sql Parsing Jinyang Li et al.
- MEGA: Multilingual Evaluation Of Generative AI Kabir Ahuja et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Simple And Controllable Music Generation Jade Copet et al.
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- "It's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- Llama: Open And Efficient Foundation Language Models Hugo Touvron et al.
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Q-instruct: Improving Low-level Visual Abilities For Multi-modality Foundation Models Haoning Wu et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Personallm: Investigating The Ability Of Large Language Models To Express Personality Traits Hang Jiang et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Choice Over Control: How Users Write With Large Language Models Using Diegetic And Non-diegetic Prompting Hai Dang, Sven Goller, Florian Lehmann, Daniel Buschek
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Exploring The Psychology Of Llms' Moral And Legal Reasoning Guilherme F. C. F. Almeida, José Luiz Nunes, Neele Engelmann, Alex Wiegmann, Marcelo De Araújo
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Can A Student Large Language Model Perform As Well As It's Teacher? Sia Gholami, Marwan Omar
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Mathprompter: Mathematical Reasoning Using Large Language Models Shima Imani, Liang Du, Harsh Shrivastava
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- Why Does Chatgpt Fall Short In Providing Truthful Answers? Shen Zheng, Jie Huang, Kevin Chen-chuan Chang
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Verigen: A Large Language Model For Verilog Code Generation Shailja Thakur et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Ragas: Automated Evaluation Of Retrieval Augmented Generation Shahul Es, Jithin James, Luis Espinosa-anke, Steven Schockaert
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Gpteval: A Survey On Assessments Of Chatgpt And GPT-4 Rui Mao, Guanyi Chen, Xulang Zhang, Frank Guerin, Erik Cambria
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Palm 2 Technical Report Rohan Anil et al.
- Chatgpt Is Not All You Need. A State Of The Art Review Of Large Generative AI Models Roberto Gozalo-brizuela, Eduardo C. Garrido-merchan
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Starcoder: May The Source Be With You! Raymond Li et al.
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Large Language Models Predict Human Sensory Judgments Across Six Modalities Raja Marjieh, Ilia Sucholutsky, Pol Van Rijn, Nori Jacoby, Thomas L. Griffiths
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Embers Of Autoregression: Understanding Large Language Models Through The Problem They Are Trained To Solve R. Thomas Mccoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Lawyer Llama Technical Report Quzhe Huang et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Evaluation Of Chatgpt-generated Medical Responses: A Systematic Review And Meta-analysis Qiuhong Wei et al.
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating Gpt-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Git-mol: A Multi-modal Large Language Model For Molecular Science With Graph, Image, And Text Pengfei Liu, Yiming Ren, Jun Tao, Zhixiang Ren
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- GPT Has Become Financially Literate: Insights From Financial Literacy Tests Of GPT And A Preliminary Test Of How People Use It As A Source Of Advice Paweł Niszczota, Sami Abbas
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- GPT-4 Technical Report Openai et al.
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Automated Annotation With Generative AI Requires Validation Nicholas Pangakis, Samuel Wolken, Neil Fasching
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Chatgpt MT: Competitive For High- (but Not Low-) Resource Languages Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
- Exploring The Potential Of Large Language Models To Generate Formative Programming Feedback Natalie Kiesler, Dominic Lohr, Hieke Keuning
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Clever Hans Or Neural Theory Of Mind? Stress Testing Social Reasoning In Large Language Models Natalie Shapira et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- A Review Of Chatgpt Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- GPT-RE: In-context Learning For Relation Extraction Using Large Language Models Zhen Wan et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Memorybank: Enhancing Large Language Models With Long-term Memory Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Is Chatgpt A Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation Tao Fang et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Textbooks Are All You Need Suriya Gunasekar et al.
- Transformative Effects Of Chatgpt On Modern Education: Emerging Era Of AI Chatbots Sukhpal Singh Gill et al.
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- Pali-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et al.
- Multilingual Machine Translation With Large Language Models: Empirical Results And Analysis Wenhao Zhu et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- Is Chatgpt Equipped With Emotional Dialogue Capabilities? Weixiang Zhao et al.
- Layoutgpt: Compositional Visual Planning And Generation With Large Language Models Weixi Feng et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Bias Of Ai-generated Content: An Examination Of News Produced By Large Language Models Xiao Fang et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- MMMU: A Massive Multi-discipline Multimodal Understanding And Reasoning Benchmark For Expert AGI Xiang Yue et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Learning To Reason Over Scene Graphs: A Case Study Of Finetuning GPT-2 Into A Robot Language Model For Grounded Task Planning Georgia Chalvatzaki et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- Performance Of The Pre-trained Large Language Model GPT-4 On Automated Short Answer Grading Gerd Kortemeyer
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Perspectives On Large Language Models For Relevance Judgment Guglielmo Faggioli et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Moviechat: From Dense Token To Sparse Memory For Long Video Understanding Enxin Song et al.
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- REFINER: Reasoning Feedback On Intermediate Representations Debjit Paul et al.
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- Show-1: Marrying Pixel And Latent Diffusion Models For Text-to-video Generation David Junhao Zhang et al.
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Have Llms Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models Daman Arora, Himanshu Gaurav Singh, Mausam
- Response: Emergent Analogical Reasoning In Large Language Models Damian Hodel, Jevin West
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Progressive-hint Prompting Improves Reasoning In Large Language Models Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Is Chatgpt A General-purpose Natural Language Processing Task Solver? Chengwei Qin et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- A Study Of Generative Large Language Model For Medical Research And Healthcare Cheng Peng et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Chatgpt Is A Remarkable Tool -- For Experts Amos Azaria, Rina Azoulay, Shulamit Reches
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- A Categorical Archive Of Chatgpt Failures Ali Borji
- Large Language Models For Telecom: Forthcoming Impact On The Industry Ali Maatouk, Nicola Piovesan, Fadhel Ayed, Antonio De Domenico, Merouane Debbah
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- Poisoning Language Models During Instruction Tuning Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Mistral 7B Albert Q. Jiang et al.
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Calibrated Language Models Must Hallucinate Adam Tauman Kalai, Santosh S. Vempala
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- Phoenix: Democratizing Chatgpt Across Languages Zhihong Chen et al.
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Can Chatgpt Reproduce Human-generated Labels? A Study Of Social Computing Tasks Yiming Zhu, Peixian Zhang, Ehsan-ul Haq, Pan Hui, Gareth Tyson
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Flexgen: High-throughput Generative Inference Of Large Language Models With A Single GPU Ying Sheng et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Jailbreaking Chatgpt Via Prompt Engineering: An Empirical Study Yi Liu et al.
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- RTLLM: An Open-source Benchmark For Design RTL Generation With Large Language Model Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie
- Improving Language Model Negotiation With Self-play And In-context Learning From AI Feedback Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Classeval: A Manually-crafted Benchmark For Evaluating Llms On Class-level Code Generation Xueying Du et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Performance Comparison Of Large Language Models On VNHSGE English Dataset: Openai Chatgpt, Microsoft Bing Chat, And Google Bard Xuan-quy Dao
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Hugginggpt: Solving AI Tasks With Chatgpt And Its Friends In Hugging Face Yongliang Shen et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- MEDITRON-70B: Scaling Medical Pretraining For Large Language Models Zeming Chen et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Learning Gain Differences Between Chatgpt And Human Tutor Generated Algebra Hints Zachary A. Pardos, Shreya Bhandari
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- "do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- In Chatgpt We Trust? Measuring And Characterizing The Reliability Of Chatgpt Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Earthgpt: A Universal Multi-modal Large Language Model For Multi-sensor Image Comprehension In Remote Sensing Domain Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team Glm et al.
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- The Era Of 1-bit Llms: All Large Language Models Are In 1.58 Bits Shuming Ma et al.
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Large Legal Fictions: Profiling Legal Hallucinations In Large Language Models Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Pixart-σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Clochat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- Olmo: Accelerating The Science Of Language Models Dirk Groeneveld et al.
- Chemllm: A Chemical Large Language Model Di Zhang et al.
- Deepseek-coder: When The Large Language Model Meets Programming -- The Rise Of Code Intelligence Daya Guo et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- AI And Memory Wall Amir Gholami et al.
- Financial Statement Analysis With Large Language Models Alex Kim, Maximilian Muhn, Valeri Nikolaev
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- Findings Of The Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Alex Warstadt et al.
🏷 Multimodal Models
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Multimodal Transformer Networks For End-to-end Video-grounded Dialogue Systems Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- A Closer Look At The Robustness Of Vision-and-language Pre-trained Models Linjie Li, Zhe Gan, Jingjing Liu
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- FILM: Following Instructions In Language With Modular Methods So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Multimodal Dialogue Response Generation Qingfeng Sun et al.
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- Image Captioning For Effective Use Of Language Models In Knowledge-based Visual Question Answering Ander Salaberria, Gorka Azkune, Oier Lopez De Lacalle, Aitor Soroa, Eneko Agirre
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- SIMMC 2.0: A Task-oriented Dialog Dataset For Immersive Multimodal Conversations Satwik Kottur, Seungwhan Moon, Alborz Geramifard, Babak Damavandi
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- E-vil: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Dall-eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Uni-perceiver V2: A Generalist Model For Large-scale Vision And Vision-language Tasks Hao Li et al.
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Altclip: Altering The Language Encoder In CLIP For Extended Language Capabilities Zhongzhi Chen et al.
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- A Survey Of Vision-language Pre-trained Models Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- "this Is My Unicorn, Fluffy": Personalizing Frozen Vision-language Representations Niv Cohen, Rinon Gal, Eli A. Meirom, Gal Chechik, Yuval Atzmon
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Geochat: Grounded Large Vision-language Model For Remote Sensing Kartik Kuckreja et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- Honeybee: Locality-enhanced Projector For Multimodal LLM Junbum Cha, Wooyoung Kang, Jonghwan Mun, Byungseok Roh
- LERF: Language Embedded Radiance Fields Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, Matthew Tancik
- Mixphm: Redundancy-aware Parameter-efficient Tuning For Low-resource Visual Question Answering Jingjing Jiang, Nanning Zheng
- Generating Images With Multimodal Language Models Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Llm-grounder: Open-vocabulary 3D Visual Grounding With Large Language Model As An Agent Jianing Yang et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Ferret: Refer And Ground Anything Anywhere At Any Granularity Haoxuan You et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Glamm: Pixel Grounding Large Multimodal Model Hanoona Rasheed et al.
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Mplug-2: A Modularized Multi-modal Foundation Model Across Text, Image And Video Haiyang Xu et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- LL3DA: Visual Interactive Instruction Tuning For Omni-3d Understanding, Reasoning, And Planning Sijin Chen et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Timechat: A Time-sensitive Multimodal Large Language Model For Long Video Understanding Shuhuai Ren, Linli Yao, Shicheng Li, Xu Sun, Lu Hou
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- GPT-4 Technical Report Openai et al.
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Are Aligned Neural Networks Adversarially Aligned? Nicholas Carlini et al.
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- Pali-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Mm-vet: Evaluating Large Multimodal Models For Integrated Capabilities Weihao Yu et al.
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- MMMU: A Massive Multi-discipline Multimodal Understanding And Reasoning Benchmark For Expert AGI Xiang Yue et al.
- Internvl: Scaling Up Vision Foundation Models And Aligning For Generic Visual-linguistic Tasks Zhe Chen et al.
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Gemini: A Family Of Highly Capable Multimodal Models Gemini Team et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Palm-e: An Embodied Multimodal Language Model Danny Driess et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Hallucination Augmented Contrastive Learning For Multimodal Large Language Model Chaoya Jiang et al.
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- MIMIC-IT: Multi-modal In-context Instruction Tuning Bo Li et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Vtimellm: Empower LLM To Grasp Video Moments Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- RT-2: Vision-language-action Models Transfer Web Knowledge To Robotic Control Anthony Brohan et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- 3D-LLM: Injecting The 3D World Into Large Language Models Yining Hong et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Evaluating Object Hallucination In Large Vision-language Models Yifan Li et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- Beyond Chain-of-thought, Effective Graph-of-thought Reasoning In Language Models Yao Yao, Zuchao Li, Hai Zhao
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Contextual Object Detection With Multimodal Large Language Models Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
- Llava-mr: Large Language-and-vision Assistant For Video Moment Retrieval Weiheng Lu et al.
- Earthgpt: A Universal Multi-modal Large Language Model For Multi-sensor Image Comprehension In Remote Sensing Domain Wei Zhang, Miaoxin Cai, Tong Zhang, Yin Zhuang, Xuerui Mao
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- When Large Language Model Agents Meet 6G Networks: Perception, Grounding, And Alignment Minrui Xu et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
🏷 NeurIPS
🏷 Pre-Training
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Visualbert: A Simple And Performant Baseline For Vision And Language Liunian Harold Li, Mark Yatskar, Da Yin, Cho-jui Hsieh, Kai-wei Chang
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- CLEAR: Contrastive Learning For Sentence Representation Zhuofeng Wu et al.
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Layoutlmv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- PALM: Pre-training An Autoencoding&autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Multilingual Denoising Pre-training For Neural Machine Translation Yinhan Liu et al.
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Contrastive Code Representation Learning Paras Jain et al.
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Pre-training Via Paraphrasing Mike Lewis et al.
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- EVA: An Open-domain Chinese Dialogue System With Large-scale Generative Pre-training Hao Zhou et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Grounded Language-image Pre-training Liunian Harold Li et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- XLM-E: Cross-lingual Language Model Pre-training Via ELECTRA Zewen Chi et al.
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Deltalm: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Unified Pre-training For Program Understanding And Generation Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- Structurallm: Structural Pre-training For Form Understanding Chenliang Li et al.
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Indicbart: A Pre-trained Model For Indic Natural Language Generation Raj Dabre et al.
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- Multi-task Pre-training For Plug-and-play Task-oriented Dialogue System Yixuan Su et al.
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-hak Kim, Edward Choi
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Evolution Through Large Models Joel Lehman et al.
- Fine-tuned Language Models Are Continual Learners Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-jui Fu et al.
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- DS-1000: A Natural And Reliable Benchmark For Data Science Code Generation Yuhang Lai et al.
- EVA2.0: Investigating Open-domain Chinese Dialogue Systems With Large-scale Pre-training Yuxian Gu et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Learning Video Representations From Large Language Models Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- A Survey Of Vision-language Pre-trained Models Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- Large Language Models Struggle To Learn Long-tail Knowledge Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, Colin Raffel
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Auditing Large Language Models: A Three-layered Approach Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Scalable Educational Question Generation With Pre-trained Language Models Sahan Bulathwela, Hamze Muse, Emine Yilmaz
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Baichuan 2: Open Large-scale Language Models Aiyuan Yang et al.
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- AI And Memory Wall Amir Gholami et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
🏷 Prompting
- Hierarchical Neural Story Generation Angela Fan, Mike Lewis, Yann Dauphin
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- How Can We Know What Language Models Know? Zhengbao Jiang, Frank F. Xu, Jun Araki, Graham Neubig
- Boolq: Exploring The Surprising Difficulty Of Natural Yes/no Questions Christopher Clark et al.
- The Radicalization Risks Of GPT-3 And Advanced Neural Language Models Kris Mcguffie, Alex Newhouse
- Content Planning For Neural Story Generation With Aristotelian Rescoring Seraphina Goldfarb-tarrant, Tuhin Chakrabarty, Ralph Weischedel, Nanyun Peng
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Autoprompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Collaborative Storytelling With Large-scale Neural Language Models Eric Nichols, Leo Gao, Randy Gomez
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Knowprompt: Knowledge-aware Prompt-tuning With Synergistic Optimization For Relation Extraction Xiang Chen et al.
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Controllable Generation From Pre-trained Language Models Via Inverse Prompting Xu Zou et al.
- PTR: Prompt Tuning With Rules For Text Classification Xu Han, Weilin Zhao, Ning Ding, Zhiyuan Liu, Maosong Sun
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Program Synthesis With Large Language Models Jacob Austin et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- Exploring Prompt-based Few-shot Learning For Grounded Dialog Generation Chujie Zheng, Minlie Huang
- Dialogue State Tracking With A Language Model Using Schema-driven Prompting Chia-hsuan Lee, Hao Cheng, Mari Ostendorf
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- LFPT5: A Unified Framework For Lifelong Few-shot Language Learning Based On Prompt Tuning Of T5 Chengwei Qin, Shafiq Joty
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task--next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- Multitask Prompted Training Enables Zero-shot Task Generalization Victor Sanh et al.
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- Learning To Retrieve Prompts For In-context Learning Ohad Rubin, Jonathan Herzig, Jonathan Berant
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Characterchat: Supporting The Creation Of Fictional Characters Through Conversation And Progressive Manifestation With A Chatbot Oliver Schmitt, Daniel Buschek
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- AI Chains: Transparent And Controllable Human-ai Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- Do Prompt-based Models Really Understand The Meaning Of Their Prompts? Albert Webson, Ellie Pavlick
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- Spot: Better Frozen Model Adaptation Through Soft Prompt Transfer Tu Vu, Brian Lester, Noah Constant, Rami Al-rfou, Daniel Cer
- GPT Understands, Too Xiao Liu et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Logan IV et al.
- FLEX: Unifying Evaluation For Few-shot NLP Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- Generated Knowledge Prompting For Commonsense Reasoning Jiacheng Liu et al.
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations Jaehun Jung et al.
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Progprompt: Generating Situated Robot Task Plans Using Large Language Models Ishika Singh et al.
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- Scaling Instruction-finetuned Language Models Hyung Won Chung et al.
- Chain-of-thought Prompting Elicits Reasoning In Large Language Models Jason Wei et al.
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- The Unreliability Of Explanations In Few-shot Prompting For Textual Reasoning Xi Ye, Greg Durrett
- Biobart: Pretraining And Evaluation Of A Biomedical Generative Language Model Hongyi Yuan et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Demystifying Prompts In Language Models Via Perplexity Estimation Hila Gonen, Srini Iyer, Terra Blevins, Noah A. Smith, Luke Zettlemoyer
- Instruction Tuning For Few-shot Aspect-based Sentiment Analysis Siddharth Varia et al.
- Teaching Algorithmic Reasoning Via In-context Learning Hattie Zhou et al.
- Repair Is Nearly Generation: Multilingual Program Repair With Llms Harshit Joshi et al.
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Promptsource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- In-context Examples Selection For Machine Translation Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Evaluating And Inducing Personality In Pre-trained Language Models Guangyuan Jiang et al.
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- A Very Preliminary Analysis Of DALL-E 2 Gary Marcus, Ernest Davis, Scott Aaronson
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Language Models Are Multilingual Chain-of-thought Reasoners Freda Shi et al.
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Retrieval-augmented Generative Question Answering For Event Argument Extraction Xinya Du, Heng Ji
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-burch
- Toxigen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Language Models That Seek For Knowledge: Modular Search & Generation For Dialogue And Prompt Completion Kurt Shuster et al.
- When To Make Exceptions: Exploring Language Models As Accounts Of Human Moral Judgment Zhijing Jin et al.
- Re3: Generating Longer Stories With Recursive Reprompting And Revision Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein
- Chatgpt Makes Medicine Easy To Swallow: An Exploratory Case Study On Simplified Radiology Reports Katharina Jeblick et al.
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Are Large Pre-trained Language Models Leaking Your Personal Information? Jie Huang, Hanyin Shao, Kevin Chen-chuan Chang
- Large Language Models Can Self-improve Jiaxin Huang et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- A Fine-grained Comparison Of Pragmatic Language Understanding In Humans And Language Models Jennifer Hu, Sammy Floyd, Olessia Jouravlev, Evelina Fedorenko, Edward Gibson
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Dualprompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- PAL: Program-aided Language Models Luyu Gao et al.
- Teaching Small Language Models To Reason Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adamek, Eric Malmi, Aliaksei Severyn
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-kellner, Marc Fischer, Martin Vechev
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Towards Using Few-shot Prompt Learning For Automating Model Completion Meriem Ben Chaaben, Lola Burgueño, Houari Sahraoui
- Visual Prompt Tuning Menglin Jia et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- Hyperprompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Convfinqa: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- Self-consistency Improves Chain Of Thought Reasoning In Language Models Xuezhi Wang et al.
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Promda: Prompt-based Data Augmentation For Low-resource NLU Tasks Yufei Wang et al.
- Language Model Cascades David Dohan et al.
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- A Unified Multi-task Learning Framework For Multi-goal Conversational Recommender Systems Yang Deng et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Enabling Conversational Interaction With Mobile UI Using Large Language Models Bryan Wang, Gang Li, Yang Li
- Iteratively Prompt Pre-trained Language Models For Chain Of Thought Boshi Wang, Xiang Deng, Huan Sun
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- BLOOM: A 176b-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- Making Large Language Models Better Reasoners With Step-aware Verifier Yifei Li et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Compositional Semantic Parsing With Large Language Models Andrew Drozdov et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- Text And Patterns: For Effective Chain Of Thought, It Takes Two To Tango Aman Madaan, Amir Yazdanbakhsh
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- IDPG: An Instance-dependent Prompt Generation Method Zhuofeng Wu et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- Large Language Models Are Better Reasoners With Self-verification Yixuan Weng et al.
- Dual Modality Prompt Tuning For Vision-language Pre-trained Model Yinghui Xing et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- Large Language Models Are Human-level Prompt Engineers Yongchao Zhou et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Srivastava et al.
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Conversing With Copilot: Exploring Prompt Engineering For Solving CS1 Problems Using Natural Language Paul Denny, Viraj Kumar, Nasser Giacaman
- PINTO: Faithful Language Reasoning Using Prompt-generated Rationales Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo D'amorim, Michael Pradel
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Make-a-scene: Scene-based Text-to-image Generation With Human Priors Oran Gafni et al.
- Unnatural Instructions: Tuning Language Models With (almost) No Human Labor Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
- Holistic Evaluation Of Language Models Percy Liang et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- On Second Thought, Let's Not Think Step By Step! Bias And Toxicity In Zero-shot Reasoning Omar Shaikh, Hongxin Zhang, William Held, Michael Bernstein, Diyi Yang
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- Towards Unified Conversational Recommender Systems Via Knowledge-enhanced Prompt Learning Xiaolei Wang, Kun Zhou, Ji-rong Wen, Wayne Xin Zhao
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Generate Rather Than Retrieve: Large Language Models Are Strong Context Generators Wenhao Yu et al.
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Evaluating Large Language Models In Theory Of Mind Tasks Michal Kosinski
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Quantifying Language Models' Sensitivity To Spurious Features In Prompt Design Or: How I Learned To Start Worrying About Prompt Formatting Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Mvp: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction Zhibin Gou, Qingyan Guo, Yujiu Yang
- Label Supervised Llama Finetuning Zongxi Li et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Errors Are Useful Prompts: Instruction Guided Task Programming With Verifier-assisted Iterative Prompting Marta Skreta et al.
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Dictionary-based Phrase-level Prompting Of Large Language Models For Machine Translation Marjan Ghazvininejad, Hila Gonen, Luke Zettlemoyer
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Next-step Hint Generation For Introductory Programming Using Large Language Models Lianne Roest, Hieke Keuning, Johan Jeuring
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- In-context Impersonation Reveals Large Language Models' Strengths And Biases Leonard Salewski, Stephan Alaniz, Isabel Rio-torto, Eric Schulz, Zeynep Akata
- Layoutllm-t2i: Eliciting Layout Guidance From LLM For Text-to-image Generation Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-seng Chua
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Geotechnical Parrot Tales (GPT): Harnessing Large Language Models In Geotechnical Engineering Krishna Kumar
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Large Language Models And Simple, Stupid Bugs Kevin Jesse, Toufique Ahmed, Premkumar T. Devanbu, Emily Morgan
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-wei Chang et al.
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Spear Phishing With Large Language Models Julian Hazell
- LERF: Language Embedded Radiance Fields Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, Matthew Tancik
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- A Unified Generative Retriever For Knowledge-intensive Language Tasks Via Prompt Learning Jiangui Chen et al.
- Compositional Exemplars For In-context Learning Jiacheng Ye, Zhiyong Wu, Jiangtao Feng, Tao Yu, Lingpeng Kong
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- GPT-3.5, GPT-4, Or BARD? Evaluating Llms Reasoning Ability In Zero-shot Setting And Performance Boosting Through Prompts Jessica López Espejel, El Hassane Ettifouri, Mahaman Sanoussi Yahaya Alassan, El Mehdi Chouham, Walid Dahhane
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Symbol Tuning Improves In-context Learning In Language Models Jerry Wei et al.
- Prompting Is Not A Substitute For Probability Measurements In Large Language Models Jennifer Hu, Roger Levy
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- Llmlingua: Compressing Prompts For Accelerated Inference Of Large Language Models Huiqiang Jiang, Qianhui Wu, Chin-yew Lin, Yuqing Yang, Lili Qiu
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Not All Languages Are Created Equal In Llms: Improving Multilingual Capability By Cross-lingual-thought Prompting Haoyang Huang et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- CMMLU: Measuring Massive Multitask Language Understanding In Chinese Haonan Li et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Visual-language Prompt Tuning With Knowledge-guided Context Optimization Hantao Yao, Rui Zhang, Changsheng Xu
- Glamm: Pixel Grounding Large Multimodal Model Hanoona Rasheed et al.
- Prompting Large Language Models For Topic Modeling Han Wang et al.
- Llm-rec: Personalized Recommendation Via Prompting Large Language Models Hanjia Lyu et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Choice Over Control: How Users Write With Large Language Models Using Diegetic And Non-diegetic Prompting Hai Dang, Sven Goller, Florian Lehmann, Daniel Buschek
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Chatgpt Is Fun, But It Is Not Funny! Humor Is Still Challenging Large Language Models Sophie Jentzsch, Kristian Kersting
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Retrieving Supporting Evidence For Generative Question Answering Siqing Huo, Negar Arabzadeh, Charles L. A. Clarke
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- LL3DA: Visual Interactive Instruction Tuning For Omni-3d Understanding, Reasoning, And Planning Sijin Chen et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Mathprompter: Mathematical Reasoning Using Large Language Models Shima Imani, Liang Du, Harsh Shrivastava
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- On Codex Prompt Engineering For OCL Generation: An Empirical Study Seif Abukhalaf, Mohammad Hamdaqa, Foutse Khomh
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-wei Lee, Wen-haw Chong, Jing Jiang
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Catalyst: Domain-extensible Intervention For Preventing Task Procrastination Using Large Generative Models Riku Arakawa, Hiromu Yakura, Masataka Goto
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- VELMA: Verbalization Embodiment Of LLM Agents For Vision And Language Navigation In Street View Raphael Schumann et al.
- Starcoder: May The Source Be With You! Raymond Li et al.
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- Prompt Engineering A Prompt Engineer Qinyuan Ye, Maxamed Axmed, Reid Pryzant, Fereshte Khani
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating GPT-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Are Large Language Models Geospatially Knowledgeable? Prabin Bhandari, Antonios Anastasopoulos, Dieter Pfoser
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-bonilla et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Automated Annotation With Generative AI Requires Validation Nicholas Pangakis, Samuel Wolken, Neil Fasching
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Introducing Language Guidance In Prompt-based Continual Learning Muhammad Gul Zain Ali Khan et al.
- State Of What Art? A Call For Multi-prompt LLM Evaluation Moran Mizrahi et al.
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- DIN-SQL: Decomposed In-context Learning Of Text-to-sql With Self-correction Mohammadreza Pourreza, Davood Rafiei
- A Review Of Chatgpt Applications In Education, Marketing, Software Engineering, And Healthcare: Benefits, Drawbacks, And Research Directions Mohammad Fraiwan, Natheer Khasawneh
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- Unleashing The Emergent Cognitive Synergy In Large Language Models: A Task-solving Agent Through Multi-persona Self-collaboration Zhenhailong Wang et al.
- MM-REACT: Prompting Chatgpt For Multimodal Reasoning And Action Zhengyuan Yang et al.
- Unlocking The Potential Of Chatgpt: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Fully Autonomous Programming With Large Language Models Vadim Liventsev, Anastasiia Grishina, Aki Härmä, Leon Moonen
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Better Patching Using LLM Prompting, Via Self-consistency Toufique Ahmed, Premkumar Devanbu
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- Enabling Large Language Models To Generate Text With Citations Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen
- Diagnostic Reasoning Prompts Reveal The Potential For Large Language Model Interpretability In Medicine Thomas Savage, Ashwin Nayak, Robert Gallo, Ekanath Rangan, Jonathan H Chen
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Codekgc: Code Language Model For Generative Knowledge Graph Construction Zhen Bi et al.
- Chatgpt: Beginning Of An End Of Manual Linguistic Data Annotation? Use Case Of Automatic Genre Identification Taja Kuzman, Igor Mozetič, Nikola Ljubešić
- What Can Large Language Models Do In Chemistry? A Comprehensive Benchmark On Eight Tasks Taicheng Guo et al.
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Evallm: Interactive Evaluation Of Large Language Model Prompts On User-defined Criteria Tae Soo Kim, Yoonjoo Lee, Jamin Shin, Young-ho Kim, Juho Kim
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Analyzing The Performance Of GPT-3.5 And GPT-4 In Grammatical Error Correction Steven Coyne, Keisuke Sakaguchi, Diana Galvan-sosa, Michael Zock, Kentaro Inui
- Promptify: Text-to-image Generation Through Interactive Prompt Exploration With Large Language Models Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- Multitask Prompt Tuning Enables Parameter-efficient Transfer Learning Zhen Wang et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-ho Kim
- Language Models Represent Space And Time Wes Gurnee, Max Tegmark
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Bias Of Ai-generated Content: An Examination Of News Produced By Large Language Models Xiao Fang et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Large Language Models Can Be Easily Distracted By Irrelevant Context Freda Shi et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Is Chatgpt Better Than Human Annotators? Potential And Limitations Of Chatgpt In Explaining Implicit Hate Speech Fan Huang, Haewoon Kwak, Jisun An
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- REFINER: Reasoning Feedback On Intermediate Representations Debjit Paul et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Have Llms Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models Daman Arora, Himanshu Gaurav Singh, Mausam
- Chatgpt Evaluation On Sentence Level Relations: A Focus On Temporal, Causal, And Discourse Relations Chunkit Chan et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Progressive-hint Prompting Improves Reasoning In Large Language Models Chuanyang Zheng, Zhengying Liu, Enze Xie, Zhenguo Li, Yu Li
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- Dipping Plms Sauce: Bridging Structure And Text For Effective Knowledge Graph Completion Via Conditional Soft Prompting Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-yan Lam
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Prompting Large Language Model For Machine Translation: A Case Study Biao Zhang, Barry Haddow, Alexandra Birch
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- Better Zero-shot Reasoning With Role-play Prompting Aobo Kong et al.
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- What Does CLIP Know About A Red Circle? Visual Prompt Engineering For Vlms Aleksandar Shtedritski, Christian Rupprecht, Andrea Vedaldi
- How To Unleash The Power Of Large Language Models For Few-shot Relation Extraction? Xin Xu, Yuqi Zhu, Xiaohan Wang, Ningyu Zhang
- 3D-LLM: Injecting The 3D World Into Large Language Models Yining Hong et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- Llm-eval: Unified Multi-dimensional Automatic Evaluation For Open-domain Conversations With Large Language Models Yen-ting Lin, Yun-nung Chen
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Prompting Large Language Models With Speech Recognition Abilities Yassir Fathullah et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Jailbreaking Chatgpt Via Prompt Engineering: An Empirical Study Yi Liu et al.
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- RTLLM: An Open-source Benchmark For Design RTL Generation With Large Language Model Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- March In Chat: Interactive Prompting For Remote Embodied Referring Expression Yanyuan Qiao, Yuankai Qi, Zheng Yu, Jing Liu, Qi Wu
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Speak Foreign Languages With Your Own Voice: Cross-lingual Neural Codec Language Modeling Ziqiang Zhang et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Spec: A Soft Prompt-based Calibration On Performance Variability Of Large Language Model In Clinical Notes Summarization Yu-neng Chuang, Ruixiang Tang, Xiaoqian Jiang, Xia Hu
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Autotamp: Autoregressive Task And Motion Planning With Llms As Translators And Checkers Yongchao Chen et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- "do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- Contextual AI Journaling: Integrating LLM And Time Series Behavioral Sensing Technology To Promote Self-reflection And Well-being Using The Mindscape App Subigya Nepal et al.
- Who Validates The Validators? Aligning Llm-assisted Evaluation Of LLM Outputs With Human Preferences Shreya Shankar, J. D. Zamfirescu-pereira, Björn Hartmann, Aditya G. Parameswaran, Ian Arawjo
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- Large Language Model Capabilities In Perioperative Risk Prediction And Prognostication Philip Chung et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- CBR-RAG: Case-based Reasoning For Retrieval Augmented Generation In Llms For Legal Question Answering Nirmalie Wiratunga et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- A Piece Of Theatre: Investigating How Teachers Design LLM Chatbots To Assist Adolescent Cyberbullying Education Michael A. Hedderich et al.
- The Effect Of Sampling Temperature On Problem Solving In Large Language Models Matthew Renze, Erhan Guven
- Supporting Sensemaking Of Large Language Model Outputs At Scale Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld, Elena L. Glassman
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Prompting Large Language Models With Rationale Heuristics For Knowledge-based Visual Question Answering Zhongjian Hu, Peng Yang, Bing Li, Fengyuan Liu
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Measurement Of Llm's Philosophies Of Human Nature Minheng Ni et al.
🏷 Pruning
- Sequence-level Knowledge Distillation Yoon Kim, Alexander M. Rush
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Learned Token Pruning For Transformers Sehoon Kim et al.
- Interactive Code Generation Via Test-driven User-intent Formalization Shuvendu K. Lahiri et al.
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
🏷 Quantization
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Bert-of-theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Smoothquant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- Structured Pruning Learns Compact And Accurate Models Mengzhou Xia, Zexuan Zhong, Danqi Chen
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Llm.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Biomistral: A Collection Of Open-source Pretrained Large Language Models For Medical Domains Yanis Labrak et al.
🏷 RAG
- Topic Aware Neural Response Generation Chen Xing et al.
- Separating Answers From Queries For Neural Reading Comprehension Dirk Weissenborn
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Programming With A Differentiable Forth Interpreter Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel
- The LAMBADA Dataset: Word Prediction Requiring A Broad Discourse Context Denis Paperno et al.
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Constructing Datasets For Multi-hop Reading Comprehension Across Documents Johannes Welbl, Pontus Stenetorp, Sebastian Riedel
- Searchqa: A New Q&A Dataset Augmented With Context From A Search Engine Matthew Dunn et al.
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Triviaqa: A Large Scale Distantly Supervised Challenge Dataset For Reading Comprehension Mandar Joshi, Eunsol Choi, Daniel S. Weld, Luke Zettlemoyer
- Grounding Language For Transfer In Deep Reinforcement Learning Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola
- Another Diversity-promoting Objective Function For Neural Dialogue Generation Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- A Dataset For Document Grounded Conversations Kangyan Zhou, Shrimai Prabhumoye, Alan W Black
- Sequence-to-sequence Learning For Task-oriented Dialogue With Dialogue State Representation Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin, Ting Liu
- Language Gans Falling Short Massimo Caccia et al.
- Seq2rdf: An End-to-end Application For Deriving Triples From Natural Language Text Yue Liu, Tongtao Zhang, Zhicheng Liang, Heng Ji, Deborah L. Mcguinness
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context Urvashi Khandelwal, He He, Peng Qi, Dan Jurafsky
- Ranking Paragraphs For Improving Answer Recall In Open-domain Question Answering Jinhyuk Lee, Seongjun Yun, Hyunjae Kim, Miyoung Ko, Jaewoo Kang
- Fast Abstractive Summarization With Reinforce-selected Sentence Rewriting Yen-chun Chen, Mohit Bansal
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- Advancing The State Of The Art In Open Domain Dialog Systems Through The Alexa Prize Chandra Khatri et al.
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Hybrid Retrieval-generation Reinforced Agent For Medical Image Report Generation Christy Y. Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- A Study Of Reinforcement Learning For Neural Machine Translation Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, Tie-yan Liu
- Complex Sequential Question Answering: Towards Learning To Converse Over Linked Question Answer Pairs With A Knowledge Graph Amrita Saha, Vardaan Pahuja, Mitesh M. Khapra, Karthik Sankaranarayanan, Sarath Chandar
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- Building A Conversational Agent Overnight With Dialogue Self-play Pararth Shah et al.
- Emrqa: A Large Corpus For Question Answering On Electronic Medical Records Anusri Pampari, Preethi Raghavan, Jennifer Liang, Jian Peng
- Training Millions Of Personalized Dialogue Agents Pierre-emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Reqa: An Evaluation For End-to-end Answer Retrieval Models Amin Ahmad, Noah Constant, Yinfei Yang, Daniel Cer
- Unsupervised Question Answering By Cloze Translation Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Enabling Robots To Understand Incomplete Natural Language Instructions Using Commonsense Reasoning Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Multi-step Retriever-reader Interaction For Scalable Open-domain Question Answering Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew Mccallum
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Scalable Attentive Sentence-pair Modeling Via Distilled Sentence Embedding Oren Barkan et al.
- Jointly Optimizing Diversity And Relevance In Neural Response Generation Xiang Gao et al.
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- Real-time Open-domain Question Answering With Dense-sparse Phrase Index Minjoon Seo et al.
- Pythia: Ai-assisted Code Completion System Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Context-aware Learning For Neural Machine Translation Sébastien Jean, Kyunghyun Cho
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Sentence-level Content Planning And Style Specification For Neural Text Generation Xinyu Hua, Lu Wang
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Personalizing Dialogue Agents Via Meta-learning Zhaojiang Lin, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Learning To Retrieve Reasoning Paths Over Wikipedia Graph For Question Answering Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Incorporating External Knowledge Into Machine Reading For Generative Question Answering Bin Bi et al.
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- Rankqa: Neural Question Answering With Answer Re-ranking Bernhard Kratzwald, Anna Eigenmann, Stefan Feuerriegel
- Distilbert, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Retrieve, Read, Rerank: Towards End-to-end Multi-document Reading Comprehension Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Modeling Recurrence For Transformer Jie Hao et al.
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Improving Neural Response Diversity With Frequency-aware Cross-entropy Loss Shaojie Jiang, Pengjie Ren, Christof Monz, Maarten De Rijke
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Gmail Smart Compose: Real-time Assisted Writing Mia Xu Chen et al.
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Learning To Select Knowledge For Response Generation In Dialog Systems Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu
- Knowledge Aware Conversation Generation With Explainable Reasoning Over Augmented Graphs Zhibin Liu, Zheng-yu Niu, Hua Wu, Haifeng Wang
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Unsupervised Paraphrase Generation Using Pre-trained Language Models Chaitra Hegde, Shrikumar Patil
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- Sequential Latent Knowledge Selection For Knowledge-grounded Dialogue Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- Like Hiking? You Probably Enjoy Nature: Persona-grounded Dialog With Commonsense Expansions Bodhisattwa Prasad Majumder, Harsh Jhamtani, Taylor Berg-kirkpatrick, Julian Mcauley
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Towards A Human-like Open-domain Chatbot Daniel Adiwardana et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- Rikinet: Reading Wikipedia Pages For Natural Question Answering Dayiheng Liu et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models Sheng-chieh Lin et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- Multilingual Translation With Extensible Multilingual Pretraining And Finetuning Yuqing Tang et al.
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Ternarybert: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Aragpt2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- Learning To Recombine And Resample Data For Compositional Generalization Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig
- Generative Data Augmentation For Commonsense Reasoning Yiben Yang et al.
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Openvidial: A Large-scale, Open-domain Dialogue Dataset With Visual Contexts Yuxian Meng et al.
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- Asking Questions The Human Way: Scalable Question-answer Generation From Text Corpus Bang Liu, Haojie Wei, Di Niu, Haolan Chen, Yancheng He
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Nearest Neighbor Machine Translation Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization Junjie Hu et al.
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- Leveraging Passage Retrieval With Generative Models For Open Domain Question Answering Gautier Izacard, Edouard Grave
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Doc2dial: A Goal-oriented Document-grounded Dialogue Dataset Song Feng et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- Turngpt: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- Imitation Attacks And Defenses For Black-box Machine Translation Systems Eric Wallace, Mitchell Stern, Dawn Song
- Knowledge-grounded Dialogue Generation With Pre-trained Language Models Xueliang Zhao et al.
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- FILM: Following Instructions In Language With Modular Methods So Yeon Min, Devendra Singh Chaplot, Pradeep Ravikumar, Yonatan Bisk, Ruslan Salakhutdinov
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Increasing Faithfulness In Knowledge-grounded Dialogue With Controllable Features Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Parallel Refinements For Lexically Constrained Text Generation With BART Xingwei He
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Grounded Language-image Pre-training Liunian Harold Li et al.
- Vision Guided Generative Pre-trained Language Models For Multimodal Abstractive Summarization Tiezheng Yu, Wenliang Dai, Zihan Liu, Pascale Fung
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Raise A Child In Large Language Model: Towards Effective And Generalizable Fine-tuning Runxin Xu et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- Prompt Programming For Large Language Models: Beyond The Few-shot Paradigm Laria Reynolds, Kyle Mcdonell
- Reframing Instructional Prompts To Gptk's Language Swaroop Mishra, Daniel Khashabi, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- A Short Survey Of Pre-trained Language Models For Conversational AI-A New Age In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- End-to-end Training Of Multi-document Reader And Retriever For Open-domain Question Answering Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Efficient Retrieval Augmented Generation From Unstructured Knowledge For Task-oriented Dialog David Thulke, Nico Daheim, Christian Dugast, Hermann Ney
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- Empowering News Recommendation With Pre-trained Language Models Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
- Structurallm: Structural Pre-training For Form Understanding Chenliang Li et al.
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Generating Datasets With Pretrained Language Models Timo Schick, Hinrich Schütze
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- Neural Path Hunter: Reducing Hallucination In Dialogue Systems Via Path Grounding Nouha Dziri, Andrea Madotto, Osmar Zaiane, Avishek Joey Bose
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Baleen: Robust Multi-hop Reasoning At Scale Via Condensed Retrieval Omar Khattab, Christopher Potts, Matei Zaharia
- An Empirical Study Of GPT-3 For Few-shot Knowledge-based VQA Zhengyuan Yang et al.
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- AI Chains: Transparent And Controllable Human-ai Interaction By Chaining Large Language Model Prompts Tongshuang Wu, Michael Terry, Carrie J. Cai
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- Tacl: Improving BERT Pre-training With Token-aware Contrastive Learning Yixuan Su et al.
- PAQ: 65 Million Probably-asked Questions And What You Can Do With Them Patrick Lewis et al.
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Bertese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Defending Against Backdoor Attacks In Natural Language Generation Xiaofei Sun et al.
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Few-shot Knowledge Graph-to-text Generation With Pretrained Language Models Junyi Li et al.
- Rome Was Built In 1776: A Case Study On Factual Correctness In Knowledge-grounded Response Generation Sashank Santhanam et al.
- Compm: Context Modeling With Speaker's Pre-trained Memory Tracking For Emotion Recognition In Conversation Joosung Lee, Wooin Lee
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Beyond Goldfish Memory: Long-term Open-domain Conversation Jing Xu, Arthur Szlam, Jason Weston
- Quality: Question Answering With Long Input Texts, Yes! Richard Yuanzhe Pang et al.
- Improving Question Answering Model Robustness With Synthetic Adversarial Data Generation Max Bartolo et al.
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- What Makes Good In-context Examples For GPT-3? Jiachang Liu et al.
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-woo Lee, Woomyeong Park
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Scaling Instruction-finetuned Language Models Hyung Won Chung et al.
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- Interactive Code Generation Via Test-driven User-intent Formalization Shuvendu K. Lahiri et al.
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Recitation-augmented Language Models Zhiqing Sun, Xuezhi Wang, Yi Tay, Yiming Yang, Denny Zhou
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Evaluating And Inducing Personality In Pre-trained Language Models Guangyuan Jiang et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Hitskt: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Retrieval-augmented Generative Question Answering For Event Argument Extraction Xinya Du, Heng Ji
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Memory-based Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-burch
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- Unifiedskg: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Distilling Reasoning Capabilities Into Smaller Language Models Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan
- Measuring Progress On Scalable Oversight For Large Language Models Samuel R. Bowman et al.
- Action-gpt: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-wei Chang, Wei-cheng Tseng, Shang-wen Li, Hung-yi Lee
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Language Models (mostly) Know What They Know Saurav Kadavath et al.
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Can Large Language Models Truly Understand Prompts? A Case Study With Negated Prompts Joel Jang, Seonghyeon Ye, Minjoon Seo
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- A Fine-grained Comparison Of Pragmatic Language Understanding In Humans And Language Models Jennifer Hu, Sammy Floyd, Olessia Jouravlev, Evelina Fedorenko, Edward Gibson
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Camel: Mean Teacher Learning For Image Captioning Manuele Barraco et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Dialfred: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-kellner, Marc Fischer, Martin Vechev
- Challenging Big-bench Tasks And Whether Chain-of-thought Can Solve Them Mirac Suzgun et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Visual Prompt Tuning Menglin Jia et al.
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Self-consistency Improves Chain Of Thought Reasoning In Language Models Xuezhi Wang et al.
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Putting Gpt-3's Creativity To The (alternative Uses) Test Claire Stevenson, Iris Smal, Matthijs Baas, Raoul Grasman, Han Van Der Maas
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian McAuley, Wayne Xin Zhao
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Learning Video Representations From Large Language Models Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Revisiting End-to-end Speech-to-text Translation From Scratch Biao Zhang, Barry Haddow, Rico Sennrich
- Codet: Code Generation With Generated Tests Bei Chen et al.
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Automatic Chain Of Thought Prompting In Large Language Models Zhuosheng Zhang, Aston Zhang, Mu Li, Alex Smola
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados
- Grips: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- When Not To Trust Language Models: Investigating Effectiveness Of Parametric And Non-parametric Memories Alex Mallen et al.
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- Towards The Next 1000 Languages In Multilingual Machine Translation: Exploring The Synergy Between Supervised And Self-supervised Learning Aditya Siddhant et al.
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Learn To Explain: Multimodal Reasoning Via Thought Chains For Science Question Answering Pan Lu et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Educational Question Generation Of Children Storybooks Via Question Type Distribution Learning And Event-centric Summarization Zhenjie Zhao et al.
- Holistic Evaluation Of Language Models Percy Liang et al.
- Conversational Question Answering On Heterogeneous Sources Philipp Christmann, Rishiraj Saha Roy, Gerhard Weikum
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-Young Yun
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Program Of Thoughts Prompting: Disentangling Computation From Reasoning For Numerical Reasoning Tasks Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Talking About Large Language Models Murray Shanahan
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Lamini-lm: A Diverse Herd Of Distilled Models From Large-scale Instructions Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-Mageed, Alham Fikri Aji
- Selenite: Scaffolding Online Sensemaking With Comprehensive Overviews Elicited From Large Language Models Michael Xieyang Liu et al.
- Mvp: Multi-view Prompting Improves Aspect Sentiment Tuple Prediction Zhibin Gou, Qingyan Guo, Yujiu Yang
- Label Supervised Llama Finetuning Zongxi Li et al.
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Nl2spec: Interactively Translating Unstructured Natural Language To Temporal Logics With Large Language Models Matthias Cosler, Christopher Hahn, Daniel Mendoza, Frederik Schmitt, Caroline Trippel
- Large Language Models Effectively Leverage Document-level Context For Literary Translation, But Critical Errors Persist Marzena Karpinska, Mohit Iyyer
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks Ling Luo et al.
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Surgicalgpt: End-to-end Language-vision GPT For Visual Question Answering In Surgery Lalithkumar Seenivasan, Mobarakol Islam, Gokul Kannan, Hongliang Ren
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- News Verifiers Showdown: A Comparative Performance Evaluation Of Chatgpt 3.5, Chatgpt 4.0, Bing AI, And Bard In News Fact-checking Kevin Matthe Caramancion
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Topical-chat: Towards Knowledge-grounded Open-domain Conversations Karthik Gopalakrishnan et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-Wei Chang et al.
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Mixphm: Redundancy-aware Parameter-efficient Tuning For Low-resource Visual Question Answering Jingjing Jiang, Nanning Zheng
- Generating Images With Multimodal Language Models Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Benchmarking Large Language Models In Retrieval-augmented Generation Jiawei Chen, Hongyu Lin, Xianpei Han, Le Sun
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- On Decoder-only Architecture For Speech-to-text And Large Language Model Integration Jian Wu et al.
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Symbol Tuning Improves In-context Learning In Language Models Jerry Wei et al.
- Leveraging Large Language Models For Sequential Recommendation Jesse Harte et al.
- Evaluating Large Language Models On A Highly-specialized Topic, Radiation Oncology Physics Jason Holmes et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Chatgpt: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Paperqa: Retrieval-augmented Generative Agent For Scientific Research Jakub Lála et al.
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Chatgpt In The Classroom: An Analysis Of Its Strengths And Weaknesses For Solving Undergraduate Computer Science Questions Ishika Joshi et al.
- Cognitive Mirage: A Review Of Hallucinations In Large Language Models Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, Weiqiang Jia
- Semantic Compression With Large Language Models Henry Gilbert, Michael Sandborn, Douglas C. Schmidt, Jesse Spencer-smith, Jules White
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Not All Languages Are Created Equal In Llms: Improving Multilingual Capability By Cross-lingual-thought Prompting Haoyang Huang et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- CMMLU: Measuring Massive Multitask Language Understanding In Chinese Haonan Li et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Q-instruct: Improving Low-level Visual Abilities For Multi-modality Foundation Models Haoning Wu et al.
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Lmdrive: Closed-loop End-to-end Driving With Large Language Models Hao Shao et al.
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Level Generation Through Large Language Models Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Large Language Models Are Effective Text Rankers With Pairwise Ranking Prompting Zhen Qin et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-Chieh Dai, Aiping Xiong, Lun-Wei Ku
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-Seng Chua
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Ragas: Automated Evaluation Of Retrieval Augmented Generation Shahul Es, Jithin James, Luis Espinosa-Anke, Steven Schockaert
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- Prompting For Multimodal Hateful Meme Classification Rui Cao, Roy Ka-Wei Lee, Wen-Haw Chong, Jing Jiang
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Lawyer Llama Technical Report Quzhe Huang et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-Ming Wu
- Translating Radiology Reports Into Plain Language Using Chatgpt And GPT-4 With Prompt Learning: Promising Results, Limitations, And Potential Qing Lyu et al.
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Selfcheckgpt: Zero-resource Black-box Hallucination Detection For Generative Large Language Models Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-Bonilla et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- Fine-tuning Or Retrieval? Comparing Knowledge Injection In Llms Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Datatales: Investigating The Use Of Large Language Models For Authoring Data-driven Articles Nicole Sultanum, Arjun Srinivasan
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Towards Understanding Sycophancy In Language Models Mrinank Sharma et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- The Troubling Emergence Of Hallucination In Large Language Models -- An Extensive Definition, Quantification, And Prescriptive Remediations Vipula Rawte et al.
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Psy-llm: Scaling Up Global Mental Health Psychological Services With Ai-based Large Language Models Tin Lai et al.
- Empirical Study Of Zero-shot NER With Chatgpt Tingyu Xie et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Large Language Models Fail On Trivial Alterations To Theory-of-mind Tasks Tomer Ullman
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Mindfuldiary: Harnessing Large Language Model To Support Psychiatric Patients' Journaling Taewan Kim et al.
- Delving Into Multimodal Prompting For Fine-grained Visual Classification Xin Jiang et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-Ho Kim
- Universalner: Targeted Distillation From Large Language Models For Open Named Entity Recognition Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon
- Generative Recommendation: Towards Next-generation Recommender Paradigm Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, Tat-Seng Chua
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Chatkbqa: A Generate-then-retrieve Framework For Knowledge Base Question Answering With Fine-tuned Large Language Models Haoran Luo et al.
- Learning To Prompt In The Classroom To Understand AI Limits: A Pilot Study Emily Theophilou et al.
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- Language Model Crossover: Variation Through Few-shot Prompting Elliot Meyerson et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-Rim Yun, Choong-Yeol Lee, Young-Kyu Kwon, Chang-Eop Kim
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Conversational Automated Program Repair Chunqiu Steven Xia, Lingming Zhang
- Opportunities And Risks Of Llms For Scalable Deliberation With Polis Christopher T. Small et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Chatdev: Communicative Agents For Software Development Chen Qian et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Interpretable Long-form Legal Question Answering With Retrieval-augmented Large Language Models Antoine Louis, Gijs Van Dijck, Gerasimos Spanakis
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Synthetic Data Generation With Large Language Models For Text Classification: Potential And Limitations Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-Seng Chua
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Fighting Fire With Fire: Can Chatgpt Detect Ai-generated Text? Amrita Bhattacharjee, Huan Liu
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Poisoning Language Models During Instruction Tuning Alexander Wan, Eric Wallace, Sheng Shen, Dan Klein
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Mistral 7B Albert Q. Jiang et al.
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Can Chatgpt Reproduce Human-generated Labels? A Study Of Social Computing Tasks Yiming Zhu, Peixian Zhang, Ehsan-Ul Haq, Pan Hui, Gareth Tyson
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Llm-eval: Unified Multi-dimensional Automatic Evaluation For Open-domain Conversations With Large Language Models Yen-Ting Lin, Yun-Nung Chen
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Translating Natural Language To Planning Goals With Large-language Models Yaqi Xie et al.
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- Recmind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Specinfer: Accelerating Generative Large Language Model Serving With Tree-based Speculative Inference And Verification Xupeng Miao et al.
- Integrating Action Knowledge And Llms For Task Planning And Situation Handling In Open Worlds Yan Ding et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-Quy Dao, Ngoc-Bich Le, Xuan-Dung Phan, Bac-Bien Ngo
- Large Language Models Are Versatile Decomposers: Decompose Evidence And Questions For Table-based Reasoning Yunhu Ye et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Hugginggpt: Solving AI Tasks With Chatgpt And Its Friends In Hugging Face Yongliang Shen et al.
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- Teaching Large Language Models To Self-debug Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou
- Mitigating Large Language Model Hallucinations Via Autonomous Knowledge Graph-based Retrofitting Xinyan Guan et al.
- Wavcaps: A Chatgpt-assisted Weakly-labelled Audio Captioning Dataset For Audio-language Multimodal Research Xinhao Mei et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Chatbot Arena: An Open Platform For Evaluating Llms By Human Preference Wei-lin Chiang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Contextual AI Journaling: Integrating LLM And Time Series Behavioral Sensing Technology To Promote Self-reflection And Well-being Using The Mindscape App Subigya Nepal et al.
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- CBR-RAG: Case-based Reasoning For Retrieval Augmented Generation In Llms For Legal Question Answering Nirmalie Wiratunga et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- When Large Language Model Agents Meet 6G Networks: Perception, Grounding, And Alignment Minrui Xu et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Xlstm: Extended Long Short-term Memory Maximilian Beck et al.
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Benchmarking Retrieval-augmented Generation For Medicine Guangzhi Xiong, Qiao Jin, Zhiyong Lu, Aidong Zhang
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
- The Power Of Noise: Redefining Retrieval For RAG Systems Florin Cuconasu et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Large Language Models And User Trust: Consequence Of Self-referential Learning Loop And The Deskilling Of Healthcare Professionals Avishek Choudhury, Zaria Chaudhry
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Autocoderover: Autonomous Program Improvement Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- CRUD-RAG: A Comprehensive Chinese Benchmark For Retrieval-augmented Generation Of Large Language Models Yuanjie Lyu et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Prompting Large Language Models With Rationale Heuristics For Knowledge-based Visual Question Answering Zhongjian Hu, Peng Yang, Bing Li, Fengyuan Liu
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
🏷 RecSys
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- A Unified Multi-task Learning Framework For Multi-goal Conversational Recommender Systems Yang Deng et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian Mcauley, Wayne Xin Zhao
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Towards Unified Conversational Recommender Systems Via Knowledge-enhanced Prompt Learning Xiaolei Wang, Kun Zhou, Ji-rong Wen, Wayne Xin Zhao
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- Uncovering Chatgpt's Capabilities In Recommender Systems Sunhao Dai et al.
- Generative Recommendation: Towards Next-generation Recommender Paradigm Wenjie Wang, Xinyu Lin, Fuli Feng, Xiangnan He, Tat-seng Chua
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Collaborative Large Language Model For Recommender Systems Yaochen Zhu, Liang Wu, Qi Guo, Liangjie Hong, Jundong Li
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Large Language Models Are Zero-shot Rankers For Recommender Systems Yupeng Hou et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- Linrec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
🏷 Reinforcement Learning
- Neural Text Generation From Structured Data With Application To The Biography Domain Remi Lebret, David Grangier, Michael Auli
- Deep Active Learning For Dialogue Generation Nabiha Asghar, Pascal Poupart, Xin Jiang, Hang Li
- A Simple, Fast Diverse Decoding Algorithm For Neural Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Separating Answers From Queries For Neural Reading Comprehension Dirk Weissenborn
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- Deep Reinforcement Learning For Dialogue Generation Jiwei Li et al.
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- Learning Python Code Suggestion With A Sparse Pointer Network Avishkar Bhoopchand, Tim Rocktäschel, Earl Barr, Sebastian Riedel
- Neural Response Generation With Dynamic Vocabularies Yu Wu et al.
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Ask The Right Questions: Active Question Reformulation With Reinforcement Learning Christian Buck et al.
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- Batch Policy Gradient Methods For Improving Neural Conversation Models Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- A Deep Reinforcement Learning Chatbot Iulian V. Serban et al.
- Neural Text Generation: A Practical Guide Ziang Xie
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Searchqa: A New Q&A Dataset Augmented With Context From A Search Engine Matthew Dunn et al.
- R³: Reinforced Reader-ranker For Open-domain Question Answering Shuohang Wang et al.
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Sample-efficient Actor-critic Reinforcement Learning With Supervised Data For Dialogue Management Pei-hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young
- Iris: A Conversational Agent For Complex Tasks Ethan Fast, Binbin Chen, Julia Mendelsohn, Jonathan Bassen, Michael Bernstein
- Grounding Language For Transfer In Deep Reinforcement Learning Karthik Narasimhan, Regina Barzilay, Tommi Jaakkola
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- Latent Intention Dialogue Models Tsung-hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Towards Explainable And Controllable Open Domain Dialogue Generation With Dialogue Acts Can Xu, Wei Wu, Yu Wu
- Seq2seq-vis: A Visual Debugging Tool For Sequence-to-sequence Models Hendrik Strobelt et al.
- Fast Abstractive Summarization With Reinforce-selected Sentence Rewriting Yen-chun Chen, Mohit Bansal
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Disentangling Language And Knowledge In Task-oriented Dialogs Dinesh Raghu, Nikhil Gupta, Mausam
- Hybrid Retrieval-generation Reinforced Agent For Medical Image Report Generation Christy Y. Li, Xiaodan Liang, Zhiting Hu, Eric P. Xing
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Towards Exploiting Background Knowledge For Building Conversation Systems Nikita Moghe, Siddhartha Arora, Suman Banerjee, Mitesh M. Khapra
- A Study Of Reinforcement Learning For Neural Machine Translation Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, Tie-yan Liu
- Complex Sequential Question Answering: Towards Learning To Converse Over Linked Question Answer Pairs With A Knowledge Graph Amrita Saha, Vardaan Pahuja, Mitesh M. Khapra, Karthik Sankaranarayanan, Sarath Chandar
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Controllable Neural Story Plot Generation Via Reward Shaping Pradyumna Tambwekar et al.
- On Evaluating And Comparing Open Domain Dialog Systems Anu Venkatesh et al.
- Guiding Policies With Language Via Meta-learning John D. Co-reyes et al.
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Babyai: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-boisvert et al.
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- Roberta: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Unsupervised Question Answering By Cloze Translation Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
- Countering Language Drift Via Visual Grounding Jason Lee, Kyunghyun Cho, Douwe Kiela
- Generating Empathetic Responses By Looking Ahead The User's Sentiment Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung
- A Survey Of Natural Language Generation Techniques With A Focus On Dialogue Systems - Past, Present And Future Directions Sashank Santhanam, Samira Shaikh
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Answering Complex Open-domain Questions Through Iterative Query Generation Peng Qi, Xiaowen Lin, Leo Mehr, Zijian Wang, Christopher D. Manning
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Ensemble-based Deep Reinforcement Learning For Chatbots Heriberto Cuayáhuitl et al.
- Incremental Transformer With Deliberation Decoder For Document Grounded Conversations Zekang Li et al.
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- ELI5: Long Form Question Answering Angela Fan et al.
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- Multifit: Efficient Multi-lingual Language Model Fine-tuning Julian Martin Eisenschlos et al.
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Counterfactual Story Reasoning And Generation Lianhui Qin et al.
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Conversing By Reading: Contentful Neural Conversation With On-demand Machine Reading Lianhui Qin et al.
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Is Multilingual BERT Fluent In Language Generation? Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- Language As An Abstraction For Hierarchical Deep Reinforcement Learning Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Do Neural Language Representations Learn Physical Commonsense? Maxwell Forbes, Ari Holtzman, Yejin Choi
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- Rankqa: Neural Question Answering With Answer Re-ranking Bernhard Kratzwald, Anna Eigenmann, Stefan Feuerriegel
- Dykgchat: Benchmarking Dialogue Generation Grounding On Dynamic Knowledge Graphs Yi-lin Tuan, Yun-nung Chen, Hung-yi Lee
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Abductive Commonsense Reasoning Chandra Bhagavatula et al.
- Patent Claim Generation By Fine-tuning Openai GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- Latent Retrieval For Weakly Supervised Open Domain Question Answering Kenton Lee, Ming-wei Chang, Kristina Toutanova
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- QASC: A Dataset For Question Answering Via Sentence Composition Tushar Khot, Peter Clark, Michal Guerquin, Peter Jansen, Ashish Sabharwal
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Towards Scalable Multi-domain Conversational Agents: The Schema-guided Dialogue Dataset Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Compressive Transformers For Long-range Sequence Modelling Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Empdg: Multiresolution Interactive Empathetic Dialogue Generation Qintong Li et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- Juice: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Reinforcement Learning Based Emotional Editing Constraint Conversation Generation Jia Li, Xiao Sun, Xing Wei, Changliang Li, Jianhua Tao
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Zero: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- KG-BART: Knowledge Graph-augmented BART For Generative Commonsense Reasoning Ye Liu, Yao Wan, Lifang He, Hao Peng, Philip S. Yu
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Masking As An Efficient Alternative To Finetuning For Pretrained Language Models Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- CG-BERT: Conditional Text Generation With BERT For Generalized Few-shot Intent Detection Congying Xia, Chenwei Zhang, Hoang Nguyen, Jiawei Zhang, Philip Yu
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- If Beam Search Is The Answer, What Was The Question? Clara Meister, Tim Vieira, Ryan Cotterell
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- Detecting Hallucinated Content In Conditional Neural Sequence Generation Chunting Zhou et al.
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Multilingual Translation With Extensible Multilingual Pretraining And Finetuning Yuqing Tang et al.
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Learning To Recombine And Resample Data For Compositional Generalization Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- X-FACTR: Multilingual Factual Knowledge Retrieval From Pretrained Language Models Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Visbert: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- You Impress Me: Dialogue Generation Via Mutual Persona Perception Qian Liu et al.
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- The Turking Test: Can Language Models Understand Instructions? Avia Efrat, Omer Levy
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- Edgebert: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Beyond English-centric Multilingual Machine Translation Angela Fan et al.
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- Question And Answer Test-train Overlap In Open-domain Question Answering Datasets Patrick Lewis, Pontus Stenetorp, Sebastian Riedel
- Facts As Experts: Adaptable And Interpretable Neural Memory Over Symbolic Knowledge Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- Intellicode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- XTREME: A Massively Multilingual Multi-task Benchmark For Evaluating Cross-lingual Generalization Junjie Hu et al.
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- DUMA: Reading Comprehension With Transposition Thinking Pengfei Zhu, Hai Zhao, Xiaoguang Li
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- BANG: Bridging Autoregressive And Non-autoregressive Generation With Large Scale Pretraining Weizhen Qi et al.
- MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-scale Language Models Peng Xu et al.
- PONE: A Novel Automatic Evaluation Metric For Open-domain Generative Dialogue Systems Tian Lan, Xian-ling Mao, Wei Wei, Xiaoyan Gao, Heyan Huang
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- Scientific Claim Verification With VERT5ERINI Ronak Pradeep, Xueguang Ma, Rodrigo Nogueira, Jimmy Lin
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- How Context Affects Language Models' Factual Predictions Fabio Petroni et al.
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Language Generation With Multi-hop Reasoning On Commonsense Knowledge Graph Haozhe Ji et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Scifive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Less Is More: Pre-train A Strong Text Encoder For Dense Retrieval Using A Weak Decoder Shuqi Lu et al.
- Cotext: Multi-task Learning With Code-text Transformer Long Phan et al.
- On The Effectiveness Of Adapter-based Tuning For Pretrained Language Model Adaptation Ruidan He et al.
- Program Synthesis With Large Language Models Jacob Austin et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Swinbert: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Arat5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-lm Deepak Narayanan et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Emotion-aware Chat Machine: Automatic Emotional Response Generation For Human-like Emotional Interaction Wei Wei et al.
- RAFT: A Real-world Few-shot Text Classification Benchmark Neel Alex et al.
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- The Impact Of Multiple Parallel Phrase Suggestions On Email Input And Composition Behaviour Of Native And Non-native English Writers Daniel Buschek, Martin Zürn, Malin Eiband
- Can Generative Pre-trained Language Models Serve As Knowledge Bases For Closed-book QA? Cunxiang Wang, Pai Liu, Yue Zhang
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- Exploring Prompt-based Few-shot Learning For Grounded Dialog Generation Chujie Zheng, Minlie Huang
- Newsbert: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Counterfactual Memorization In Neural Language Models Chiyuan Zhang et al.
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Automated Quality Assessment Of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations Nikolaos Flemotomos et al.
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- Adversarial GLUE: A Multi-task Benchmark For Robustness Evaluation Of Language Models Boxin Wang et al.
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners Ningyu Zhang et al.
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- Characterchat: Supporting The Creation Of Fictional Characters Through Conversation And Progressive Manifestation With A Chatbot Oliver Schmitt, Daniel Buschek
- Openprompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- Predicting The Performance Of Multilingual NLP Models Anirudh Srinivasan et al.
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Image Captioning For Effective Use Of Language Models In Knowledge-based Visual Question Answering Ander Salaberria, Gorka Azkune, Oier Lopez De Lacalle, Aitor Soroa, Eneko Agirre
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- One Question Answering Model For Many Languages With Cross-lingual Dense Passage Retrieval Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- Bertese: Learning To Speak To BERT Adi Haviv, Jonathan Berant, Amir Globerson
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Towards Retrieval-based Conversational Recommendation Ahtsham Manzoor, Dietmar Jannach
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Dialogue History Matters! Personalized Response Selection In Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Iv Logan et al.
- Towards Continual Knowledge Learning Of Language Models Joel Jang et al.
- SIMMC 2.0: A Task-oriented Dialog Dataset For Immersive Multimodal Conversations Satwik Kottur, Seungwhan Moon, Alborz Geramifard, Babak Damavandi
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- Beyond Goldfish Memory: Long-term Open-domain Conversation Jing Xu, Arthur Szlam, Jason Weston
- Quality: Question Answering With Long Input Texts, Yes! Richard Yuanzhe Pang et al.
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- Long Text Generation By Modeling Sentence-level And Discourse-level Coherence Jian Guan et al.
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- True Few-shot Learning With Prompts -- A Real-world Perspective Timo Schick, Hinrich Schütze
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- Reframing Human-ai Collaboration For Generating Free-text Explanations Sarah Wiegreffe, Jack Hessel, Swabha Swayamdipta, Mark Riedl, Yejin Choi
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Language Models As Agent Models Jacob Andreas
- Evaluating Mixed-initiative Conversational Search Systems Via User Simulation Ivan Sekulić, Mohammad Aliannejadi, Fabio Crestani
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- Progprompt: Generating Situated Robot Task Plans Using Large Language Models Ishika Singh et al.
- Language Models Show Human-like Content Effects On Reasoning Tasks Ishita Dasgupta et al.
- Webshop: Towards Scalable Real-world Web Interaction With Grounded Language Agents Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- React: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- Repair Is Nearly Generation: Multilingual Program Repair With Llms Harshit Joshi et al.
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Less Is More: Learning To Refine Dialogue History For Personalized Dialogue Generation Hanxun Zhong, Zhicheng Dou, Yutao Zhu, Hongjin Qian, Ji-rong Wen
- How To Prompt? Opportunities And Challenges Of Zero- And Few-shot Learning For Human-ai Interaction In Creative Applications Of Generative Models Hai Dang, Lukas Mecke, Florian Lehmann, Sven Goller, Daniel Buschek
- In-context Examples Selection For Machine Translation Sweta Agrawal, Chunting Zhou, Mike Lewis, Luke Zettlemoyer, Marjan Ghazvininejad
- Lost At C: A User Study On The Security Implications Of Large Language Model Code Assistants Gustavo Sandoval et al.
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- On The Transferability Of Pre-trained Language Models For Low-resource Programming Languages Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- SKILL: Structured Knowledge Infusion For Large Language Models Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Language Models Are Multilingual Chain-of-thought Reasoners Freda Shi et al.
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Memory-based Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Galactica: A Large Language Model For Science Ross Taylor et al.
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- Self-conditioned Embedding Diffusion For Text Generation Robin Strudel et al.
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Planbench: An Extensible Benchmark For Evaluating Large Language Models On Planning And Reasoning About Change Karthik Valmeekam, Matthew Marquez, Alberto Olmo, Sarath Sreedharan, Subbarao Kambhampati
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- Do Large Language Models Know What Humans Know? Sean Trott, Cameron Jones, Tyler Chang, James Michaelov, Benjamin Bergen
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Evolution Through Large Models Joel Lehman et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Fine-tuned Language Models Are Continual Learners Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- Chatgpt: The End Of Online Exam Integrity? Teo Susnjak
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- Confident Adaptive Language Modeling Tal Schuster et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-baptiste Alayrac et al.
- Visconde: Multi-document QA With GPT-3 And Neural Reranking Jayr Pereira, Robson Fidalgo, Roberto Lotufo, Rodrigo Nogueira
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- Dualprompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- Large Language Models Are Few(1)-shot Table Reasoners Wenhu Chen
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- Dialfred: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Lebras, Daniel Fried, Yejin Choi
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-kellner, Marc Fischer, Martin Vechev
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Do As I Can, Not As I Say: Grounding Language In Robotic Affordances Michael Ahn et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Coauthor: Designing A Human-ai Collaborative Writing Dataset For Exploring Language Model Capabilities Mina Lee, Percy Liang, Qian Yang
- Evaluating Human-language Model Interaction Mina Lee et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- Murag: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- Greaselm: Graph Reasoning Enhanced Language Models For Question Answering Xikun Zhang et al.
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Successive Prompting For Decomposing Complex Questions Dheeru Dua, Shivanshu Gupta, Sameer Singh, Matt Gardner
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Convfinqa: Exploring The Chain Of Numerical Reasoning In Conversational Finance Question Answering Zhiyu Chen et al.
- Prompting Palm For Translation: Assessing Strategies And Performance David Vilar et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-frier, Pierre-yves Oudeyer
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Unified Vision And Language Prompt Learning Yuhang Zang, Wei Li, Kaiyang Zhou, Chen Huang, Chen Change Loy
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Iteratively Prompt Pre-trained Language Models For Chain Of Thought Boshi Wang, Xiang Deng, Huan Sun
- Long Time No See! Open-domain Conversation With Long-term Persona Memory Xinchao Xu et al.
- Analogy Generation By Prompting Large Language Models: A Case Study Of Instructgpt Bhavya Bhavya, Jinjun Xiong, Chengxiang Zhai
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Language Models Are General-purpose Interfaces Yaru Hao et al.
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- Faithful Reasoning Using Large Language Models Antonia Creswell, Murray Shanahan
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- The AI Teacher Test: Measuring The Pedagogical Ability Of Blender And GPT-3 In Educational Dialogues Anaïs Tack, Chris Piech
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- When Not To Trust Language Models: Investigating Effectiveness Of Parametric And Non-parametric Memories Alex Mallen et al.
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- Solving Quantitative Reasoning Problems With Language Models Aitor Lewkowycz et al.
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- What Is It Like To Program With Artificial Intelligence? Advait Sarkar et al.
- Language Models Are Greedy Reasoners: A Systematic Formal Analysis Of Chain-of-thought Abulhair Saparov, He He
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Shammie Srivastava et al.
- Can Language Models Learn From Explanations In Context? Andrew K. Lampinen et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Promptcap: Prompt-guided Task-aware Image Captioning Yushi Hu et al.
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- Conversing With Copilot: Exploring Prompt Engineering For Solving CS1 Problems Using Natural Language Paul Denny, Viraj Kumar, Nasser Giacaman
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Holistic Evaluation Of Language Models Percy Liang et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Demonstrate-search-predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Can Large Language Models Reason About Medical Questions? Valentin Liévin, Christoffer Egeberg Hother, Andreas Geert Motzfeldt, Ole Winther
- ROSCOE: A Suite Of Metrics For Scoring Step-by-step Reasoning Olga Golovneva et al.
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- On The Origin Of Hallucinations In Conversational Models: Is It The Datasets Or The Models? Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, Siva Reddy
- Faithdial: A Faithful Benchmark For Information-seeking Dialogue Nouha Dziri et al.
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- No Language Left Behind: Scaling Human-centered Machine Translation NLLB Team et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Large Language Models Struggle To Learn Long-tail Knowledge Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, Colin Raffel
- Maple: Multi-modal Prompt Learning Muhammad Uzair Khattak, Hanoona Rasheed, Muhammad Maaz, Salman Khan, Fahad Shahbaz Khan
- Generate Rather Than Retrieve: Large Language Models Are Strong Context Generators Wenhao Yu et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- H₂O: Heavy-hitter Oracle For Efficient Generative Inference Of Large Language Models Zhenyu Zhang et al.
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- Leancontext: Cost-efficient Domain-specific Question Answering Using Llms Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- A Systematic Study And Comprehensive Evaluation Of Chatgpt On Benchmark Datasets Md Tahmid Rahman Laskar et al.
- Label Supervised Llama Finetuning Zongxi Li et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- Describe, Explain, Plan And Select: Interactive Planning With Large Language Models Enables Open-world Multi-task Agents Zihao Wang et al.
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Comparing Sentence-level Suggestions To Message-level Suggestions In Ai-mediated Communication Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
- A Bibliometric Review Of Large Language Models Research From 2017 To 2023 Lizhou Fan et al.
- Llm-grounded Diffusion: Enhancing Prompt Understanding Of Text-to-image Diffusion Models With Large Language Models Long Lian, Boyi Li, Adam Yala, Trevor Darrell
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning Linhao Luo, Yuan-fang Li, Gholamreza Haffari, Shirui Pan
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- Next-step Hint Generation For Introductory Programming Using Large Language Models Lianne Roest, Hieke Keuning, Johan Jeuring
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Dissociating Language And Thought In Large Language Models Kyle Mahowald et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Domain-specific Chatbots For Science Using Embeddings Kevin G. Yager
- Inference-time Intervention: Eliciting Truthful Answers From A Language Model Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, Martin Wattenberg
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- Geochat: Grounded Large Vision-language Model For Remote Sensing Kartik Kuckreja et al.
- Topical-chat: Towards Knowledge-grounded Open-domain Conversations Karthik Gopalakrishnan et al.
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- The Imitation Game: Detecting Human And Ai-generated Texts In The Era Of Chatgpt And BARD Kadhim Hayawi, Sakib Shahriar, Sujith Samuel Mathew
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Writer-defined AI Personas For On-demand Feedback Generation Karim Benharrak, Tim Zindulka, Florian Lehmann, Hendrik Heuer, Daniel Buschek
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Is Chatgpt A Good Recommender? A Preliminary Study Junling Liu et al.
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Spear Phishing With Large Language Models Julian Hazell
- LERF: Language Embedded Radiance Fields Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, Matthew Tancik
- Towards Llm-based Autograding For Short Textual Answers Johannes Schneider, Bernd Schenk, Christina Niklaus
- The Political Ideology Of Conversational AI: Converging Evidence On Chatgpt's Pro-environmental, Left-libertarian Orientation Jochen Hartmann, Jasper Schwenzow, Maximilian Witte
- Is Chatgpt Fair For Recommendation? Evaluating Fairness In Large Language Model Recommendation Jizhi Zhang et al.
- Qwen Technical Report Jinze Bai et al.
- "It's A Fair Game", Or Is It? Examining How Users Navigate Disclosure Risks And Benefits When Using Llm-based Conversational Agents Zhiping Zhang et al.
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- The Potential And Pitfalls Of Using A Large Language Model Such As Chatgpt Or GPT-4 As A Clinical Assistant Jingqing Zhang et al.
- Generating Images With Multimodal Language Models Jing Yu Koh, Daniel Fried, Ruslan Salakhutdinov
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Fake News In Sheep's Clothing: Robust Fake News Detection Against Llm-empowered Style Attacks Jiaying Wu, Jiafeng Guo, Bryan Hooi
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Set-of-mark Prompting Unleashes Extraordinary Visual Grounding In GPT-4V Jianwei Yang et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- Thrilled By Your Progress! Large Language Models (GPT-4) No Longer Struggle To Pass Assessments In Higher Education Programming Courses Jaromir Savelka, Arav Agarwal, Marshall An, Chris Bogart, Majd Sakr
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- Chip-chat: Challenges And Opportunities In Conversational Hardware Design Jason Blocklove, Siddharth Garg, Ramesh Karri, Hammond Pearce
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Simple And Controllable Music Generation Jade Copet et al.
- A Comparative Study Of Ai-generated (GPT-4) And Human-crafted Mcqs In Programming Education Jacob Doughty et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- Theory Of Mind For Multi-agent Collaboration Via Large Language Models Huao Li et al.
- Fingpt: Open-source Financial Large Language Models Hongyang Yang, Xiao-yang Liu, Christina Dan Wang
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Chatgpt For PLC/DCS Control Logic Generation Heiko Koziolek, Sten Gruener, Virendra Ashiwal
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Chatgpt Or Grammarly? Evaluating Chatgpt On Grammatical Error Correction Benchmark Haoran Wu, Wenxuan Wang, Yuxuan Wan, Wenxiang Jiao, Michael Lyu
- Visual Instruction Tuning Haotian Liu, Chunyuan Li, Qingyang Wu, Yong Jae Lee
- Autodroid: Llm-powered Task Automation In Android Hao Wen et al.
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Chain Of Hindsight Aligns Language Models With Feedback Hao Liu, Carmelo Sferrazza, Pieter Abbeel
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Prompting Large Language Models For Topic Modeling Han Wang et al.
- Llm-rec: Personalized Recommendation Via Prompting Large Language Models Hanjia Lyu et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Revisiting Large Language Models As Zero-shot Relation Extractors Guozheng Li, Peng Wang, Wenjun Ke
- The Refinedweb Dataset For Falcon LLM: Outperforming Curated Corpora With Web Data, And Web Data Only Guilherme Penedo et al.
- Chatgpt Hallucinates When Attributing Answers Guido Zuccon, Bevan Koopman, Razia Shaik
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- On The Possibilities Of Ai-generated Text Detection Souradip Chakraborty et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Principled Instructions Are All You Need For Questioning Llama-1/2, GPT-3.5/4 Sondos Mahmoud Bsharat, Aidar Myrzakhan, Zhiqiang Shen
- Retrieving Supporting Evidence For Generative Question Answering Siqing Huo, Negar Arabzadeh, Charles L. A. Clarke
- Llm-empowered Chatbots For Psychiatrist And Patient Simulation: Application And Evaluation Siyuan Chen et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Prompt-based Distribution Alignment For Unsupervised Domain Adaptation Shuanghao Bai et al.
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Boosting Theory-of-mind Performance In Large Language Models Via Prompting Shima Rahimi Moghaddam, Christopher J. Honey
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Language Is Not All You Need: Aligning Perception With Language Models Shaohan Huang et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Chatgpt As A Factual Inconsistency Evaluator For Text Summarization Zheheng Luo, Qianqian Xie, Sophia Ananiadou
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- Factscore: Fine-grained Atomic Evaluation Of Factual Precision In Long Form Text Generation Sewon Min et al.
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- Let's Have A Chat! A Conversation With Chatgpt: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Palm 2 Technical Report Rohan Anil et al.
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- VELMA: Verbalization Embodiment Of LLM Agents For Vision And Language Navigation In Street View Raphael Schumann et al.
- Starcoder: May The Source Be With You! Raymond Li et al.
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Large Language Models Predict Human Sensory Judgments Across Six Modalities Raja Marjieh, Ilia Sucholutsky, Pol Van Rijn, Nori Jacoby, Thomas L. Griffiths
- Can We Trust The Evaluation On Chatgpt? Rachith Aiyappa, Jisun An, Haewoon Kwak, Yong-yeol Ahn
- Lawyer Llama Technical Report Quzhe Huang et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- "It Felt Like Having A Second Mind": Investigating Human-ai Co-creativity In Prewriting With Large Language Models Qian Wan et al.
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Graphologue: Exploring Large Language Model Responses With Interactive Diagrams Peiling Jiang, Jude Rayan, Steven P. Dow, Haijun Xia
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Going Beyond Nouns With Vision & Language Models Using Synthetic Data Paola Cascante-bonilla et al.
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Ontochatgpt Information System: Ontology-driven Structured Prompts For Chatgpt Meta-learning Oleksandr Palagin, Vladislav Kaverinskiy, Anna Litvin, Kyrylo Malakhov
- GPT-4 Technical Report Openai et al.
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- Datatales: Investigating The Use Of Large Language Models For Authoring Data-driven Articles Nicole Sultanum, Arjun Srinivasan
- Harnessing Llms In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Large Language Models Are Zero-shot Time Series Forecasters Nate Gruver, Marc Finzi, Shikai Qiu, Andrew Gordon Wilson
- Chatgpt MT: Competitive For High- (but Not Low-) Resource Languages Nathaniel R. Robinson, Perez Ogayo, David R. Mortensen, Graham Neubig
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Chatgpt Is A Knowledgeable But Inexperienced Solver: An Investigation Of Commonsense Problem In Large Language Models Ning Bian et al.
- Introducing Language Guidance In Prompt-based Continual Learning Muhammad Gul Zain Ali Khan et al.
- Benefits And Harms Of Large Language Models In Digital Mental Health Munmun De Choudhury, Sachin R. Pendse, Neha Kumar
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- A Stitch In Time Saves Nine: Detecting And Mitigating Hallucinations Of Llms By Validating Low-confidence Generation Neeraj Varshney, Wenlin Yao, Hongming Zhang, Jianshu Chen, Dong Yu
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Memorybank: Enhancing Large Language Models With Long-term Memory Wanjun Zhong, Lianghong Guo, Qiqi Gao, He Ye, Yanlin Wang
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Freshllms: Refreshing Large Language Models With Search Engine Augmentation Tu Vu et al.
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Nemo Guardrails: A Toolkit For Controllable And Safe LLM Applications With Programmable Rails Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, Jonathan Cohen
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Psy-llm: Scaling Up Global Mental Health Psychological Services With Ai-based Large Language Models Tin Lai et al.
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- Enabling Large Language Models To Generate Text With Citations Tianyu Gao, Howard Yen, Jiatong Yu, Danqi Chen
- Grounding Large Language Models In Interactive Environments With Online Reinforcement Learning Thomas Carta et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Deception Abilities Emerged In Large Language Models Thilo Hagendorff
- Mindfuldiary: Harnessing Large Language Model To Support Psychiatric Patients' Journaling Taewan Kim et al.
- Sparks Of Artificial General Intelligence: Early Experiments With GPT-4 Sébastien Bubeck et al.
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Promptify: Text-to-image Generation Through Interactive Prompt Exploration With Large Language Models Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Chacha: Leveraging Large Language Models To Prompt Children To Share Their Emotions About Personal Events Woosuk Seo, Chanmo Yang, Young-ho Kim
- Language Models Represent Space And Time Wes Gurnee, Max Tegmark
- M3exam: A Multilingual, Multimodal, Multilevel Benchmark For Examining Large Language Models Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- Cogagent: A Visual Language Model For GUI Agents Wenyi Hong et al.
- BLIVA: A Simple Multimodal LLM For Better Handling Of Text-rich Visual Questions Wenbo Hu et al.
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- PMC-VQA: Visual Instruction Tuning For Medical Visual Question Answering Xiaoman Zhang et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- Language Models Can Solve Computer Tasks Geunwoo Kim, Pierre Baldi, Stephen Mcaleer
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Do Large Language Models Show Decision Heuristics Similar To Humans? A Case Study Using GPT-3.5 Gaurav Suri, Lily R. Slater, Ali Ziaee, Morgan Nguyen
- Lost In Translation: Large Language Models In Non-english Content Analysis Gabriel Nicholas, Aliya Bhatia
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Perspectives On Large Language Models For Relevance Judgment Guglielmo Faggioli et al.
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Evaluating Open-domain Question Answering In The Era Of Large Language Models Ehsan Kamalloo, Nouha Dziri, Charles L. A. Clarke, Davood Rafiei
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- Enhancing Retrieval-augmented Large Language Models With Iterative Retrieval-generation Synergy Zhihong Shao et al.
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- Chatgpt Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions Deyao Zhu et al.
- Minigpt-4: Enhancing Vision-language Understanding With Advanced Large Language Models Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, Mohamed Elhoseiny
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- The Capacity For Moral Self-correction In Large Language Models Deep Ganguli et al.
- Palm-e: An Embodied Multimodal Language Model Danny Driess et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Weak-to-strong Generalization: Eliciting Strong Capabilities With Weak Supervision Collin Burns et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Opportunities And Risks Of Llms For Scalable Deliberation With Polis Christopher T. Small et al.
- Macaw-llm: Multi-modal Language Modeling With Image, Audio, Video, And Text Integration Chenyang Lyu et al.
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Visual Chatgpt: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- A Study Of Generative Large Language Model For Medical Research And Healthcare Cheng Peng et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Memgpt: Towards Llms As Operating Systems Charles Packer et al.
- Dipping Plms Sauce: Bridging Structure And Text For Effective Knowledge Graph Completion Via Conditional Soft Prompting Chen Chen, Yufei Wang, Aixin Sun, Bing Li, Kwok-yan Lam
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- MME: A Comprehensive Evaluation Benchmark For Multimodal Large Language Models Chaoyou Fu et al.
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Does GPT-4 Pass The Turing Test? Cameron R. Jones, Benjamin K. Bergen
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Seed-bench-2: Benchmarking Multimodal Large Language Models Bohao Li et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- Friend Or Foe? Exploring The Implications Of Large Language Models On The Science System Benedikt Fecher, Marcel Hebing, Melissa Laufer, Jörg Pohle, Fabian Sofsky
- Bad Actor, Good Advisor: Exploring The Role Of Large Language Models In Fake News Detection Beizhe Hu et al.
- Expertprompting: Instructing Large Language Models To Be Distinguished Experts Benfeng Xu et al.
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Check Your Facts And Try Again: Improving Large Language Models With External Knowledge And Automated Feedback Baolin Peng et al.
- Code Llama: Open Foundation Models For Code Baptiste Rozière et al.
- Large Language Models In The Workplace: A Case Study On Prompt Engineering For Job Type Classification Benjamin Clavié, Alexandru Ciceu, Frederick Naylor, Guillaume Soulié, Thomas Brightwell
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Facilitating Self-guided Mental Health Interventions Through Human-language Model Interaction: A Case Study Of Cognitive Restructuring Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, Theresa Nguyen, Tim Althoff
- Exploring The Responses Of Large Language Models To Beginner Programmers' Help Requests Arto Hellas et al.
- The False Promise Of Imitating Proprietary Llms Arnav Gudibande et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- Chatgpt: Applications, Opportunities, And Threats Aram Bahrini et al.
- Interpretable Long-form Legal Question Answering With Retrieval-augmented Large Language Models Antoine Louis, Gijs Van Dijck, Gerasimos Spanakis
- Detecting And Preventing Hallucinations In Large Vision Language Models Anisha Gunjal, Jihan Yin, Erhan Bas
- Med-halt: Medical Domain Hallucination Test For Large Language Models Ankit Pal, Logesh Kumar Umapathi, Malaikannan Sankarasubbu
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Synthetic Data Generation With Large Language Models For Text Classification: Potential And Limitations Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- On The Application Of Large Language Models For Language Teaching And Assessment Technology Andrew Caines et al.
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
- Calibrated Language Models Must Hallucinate Adam Tauman Kalai, Santosh S. Vempala
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- Analyzing And Mitigating Object Hallucination In Large Vision-language Models Yiyang Zhou et al.
- Powerinfer: Fast Large Language Model Serving With A Consumer-grade GPU Yixin Song, Zeyu Mi, Haotong Xie, Haibo Chen
- "Kelly Is A Warm Person, Joseph Is A Role Model": Gender Biases In Llm-generated Reference Letters Yixin Wan et al.
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- 3D-LLM: Injecting The 3D World Into Large Language Models Yining Hong et al.
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- Human-centric Autonomous Systems With Llms For User Command Reasoning Yi Yang et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- Specializing Smaller Language Models Towards Multi-step Reasoning Yao Fu, Hao Peng, Litu Ou, Ashish Sabharwal, Tushar Khot
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Llama-vid: An Image Is Worth 2 Tokens In Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Integrating Action Knowledge And Llms For Task Planning And Situation Handling In Open Worlds Yan Ding et al.
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Is Chatgpt A Good Sentiment Analyzer? A Preliminary Study Zengzhi Wang et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (Llama) Using Medical Domain Knowledge Yunxiang Li et al.
- Editing Large Language Models: Problems, Methods, And Opportunities Yunzhi Yao et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Kosmos-2: Grounding Multimodal Large Language Models To The World Zhiliang Peng et al.
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- "Do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- Roco: Dialectic Multi-robot Collaboration With Large Language Models Zhao Mandi, Shreeya Jain, Shuran Song
- Searching For Best Practices In Retrieval-augmented Generation Xiaohua Wang et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Assessing AI Detectors In Identifying Ai-generated Code: Implications For Education Wei Hung Pan et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Towards Conversational Diagnostic AI Tao Tu et al.
- Who Validates The Validators? Aligning Llm-assisted Evaluation Of LLM Outputs With Human Preferences Shreya Shankar, J. D. Zamfirescu-pereira, Björn Hartmann, Aditya G. Parameswaran, Ian Arawjo
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- Iris: An Ai-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Ai-augmented Brainwriting: Investigating The Use Of Llms In Group Ideation Orit Shaer, Angelora Cooper, Osnat Mokryn, Andrew L. Kun, Hagit Ben Shoshan
- CBR-RAG: Case-based Reasoning For Retrieval Augmented Generation In Llms For Legal Question Answering Nirmalie Wiratunga et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- A Piece Of Theatre: Investigating How Teachers Design LLM Chatbots To Assist Adolescent Cyberbullying Education Michael A. Hedderich et al.
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Supporting Sensemaking Of Large Language Model Outputs At Scale Katy Ilonka Gero, Chelse Swoopes, Ziwei Gu, Jonathan K. Kummerfeld, Elena L. Glassman
- Pixart-σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Clochat: Understanding How People Customize, Interact, And Experience Personas In Large Language Models Juhye Ha, Hyeon Jeon, Daeun Han, Jinwook Seo, Changhoon Oh
- (A)I Am Not A Lawyer, But...: Engaging Legal Experts Towards Responsible LLM Policies For Legal Advice Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang
- Gemma: Open Models Based On Gemini Research And Technology Gemma Team et al.
- Gemini 1.5: Unlocking Multimodal Understanding Across Millions Of Tokens Of Context Gemini Team et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Understanding The Impact Of Long-term Memory On Self-disclosure With Large Language Model-driven Chatbots For Public Health Intervention Eunkyung Jo, Yuin Jeong, Sohyun Park, Daniel A. Epstein, Young-ho Kim
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model Deepseek-ai et al.
- Rethinking Interpretability In The Era Of Large Language Models Chandan Singh, Jeevana Priya Inala, Michel Galley, Rich Caruana, Jianfeng Gao
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Large Language Models And User Trust: Consequence Of Self-referential Learning Loop And The Deskilling Of Healthcare Professionals Avishek Choudhury, Zaria Chaudhry
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- AI And Memory Wall Amir Gholami et al.
- Optimization Methods For Personalizing Large Language Models Through Retrieval Augmentation Alireza Salemi, Surya Kallumadi, Hamed Zamani
- Autocoderover: Autonomous Program Improvement Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, Abhik Roychoudhury
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- CRUD-RAG: A Comprehensive Chinese Benchmark For Retrieval-augmented Generation Of Large Language Models Yuanjie Lyu et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- How Far Are We To GPT-4V? Closing The Gap To Commercial Multimodal Models With Open-source Suites Zhe Chen et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Can Generative Llms Create Query Variants For Test Collections? An Exploratory Study Marwah Alaofi, Luke Gallagher, Mark Sanderson, Falk Scholer, Paul Thomas
- Deepseek-r1: Incentivizing Reasoning Capability In Llms Via Reinforcement Learning Deepseek-ai et al.
🏷 Responsible AI
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Build It Break It Fix It For Dialogue Safety: Robustness From Adversarial Human Attack Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Challenges In Detoxifying Language Models Johannes Welbl et al.
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Blenderbot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- When To Make Exceptions: Exploring Language Models As Accounts Of Human Moral Judgment Zhijing Jin et al.
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- BLOOM: A 176B-parameter Open-access Multilingual Language Model Bigscience Workshop et al.
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- No Language Left Behind: Scaling Human-centered Machine Translation Nllb Team et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Languagempc: Large Language Models As Decision Makers For Autonomous Driving Hao Sha et al.
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Privacy In Large Language Models: Attacks, Defenses And Future Directions Haoran Li et al.
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Palm 2 Technical Report Rohan Anil et al.
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Starcoder: May The Source Be With You! Raymond Li et al.
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- AI Transparency In The Age Of Llms: A Human-centered Research Roadmap Q. Vera Liao, Jennifer Wortman Vaughan
- Hallucinations In Large Multilingual Translation Models Nuno M. Guerreiro et al.
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Adapted Large Language Models Can Outperform Medical Experts In Clinical Text Summarization Dave Van Veen et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Chatgpt And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Toxicity In Chatgpt: Analyzing Persona-assigned Language Models Ameet Deshpande, Vishvak Murahari, Tanmay Rajpurohit, Ashwin Kalyan, Karthik Narasimhan
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- Capabilities Of Gemini Models In Medicine Khaled Saab et al.
- Gemma: Open Models Based On Gemini Research And Technology Gemma Team et al.
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- Farsight: Fostering Responsible AI Awareness During AI Application Prototyping Zijie J. Wang, Chinmay Kulkarni, Lauren Wilcox, Michael Terry, Michael Madaio
🏷 Scaling Laws
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Revisiting Neural Scaling Laws In Language And Vision Ibrahim Alabdulmohsin, Behnam Neyshabur, Xiaohua Zhai
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Large Language Models Are Zero-shot Reasoners Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- AI And Memory Wall Amir Gholami et al.
🏷 Security
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Evaluating Text Gans As Language Models Guy Tevet, Gavriel Habib, Vered Shwartz, Jonathan Berant
- Language Gans Falling Short Massimo Caccia et al.
- Adversarial Over-sensitivity And Over-stability Strategies For Dialogue Models Tong Niu, Mohit Bansal
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Adversarially Regularising Neural NLI Models To Integrate Logical Background Knowledge Pasquale Minervini, Sebastian Riedel
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Say What I Want: Towards The Dark Side Of Neural Dialogue Models Haochen Liu, Tyler Derr, Zitao Liu, Jiliang Tang
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Learning To Retrieve Reasoning Paths Over Wikipedia Graph For Question Answering Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong
- Generating Persona Consistent Dialogues By Exploiting Natural Language Inference Haoyu Song, Wei-nan Zhang, Jingwen Hu, Ting Liu
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Universal Adversarial Triggers For Attacking And Analyzing NLP Eric Wallace, Shi Feng, Nikhil Kandpal, Matt Gardner, Sameer Singh
- Build It Break It Fix It For Dialogue Safety: Robustness From Adversarial Human Attack Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Empdg: Multiresolution Interactive Empathetic Dialogue Generation Qintong Li et al.
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- ACUTE-EVAL: Improved Dialogue Evaluation With Optimized Questions And Multi-turn Comparisons Margaret Li, Jason Weston, Stephen Roller
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- BERT Loses Patience: Fast And Robust Inference With Early Exit Wangchunshu Zhou et al.
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- Contextualized Perturbation For Textual Adversarial Attack Dianqi Li et al.
- A Simple Language Model For Task-oriented Dialogue Ehsan Hosseini-asl, Bryan Mccann, Chien-sheng Wu, Semih Yavuz, Richard Socher
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Generative Data Augmentation For Commonsense Reasoning Yiben Yang et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Contrastive Code Representation Learning Paras Jain et al.
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- An Empirical Study On Robustness To Spurious Correlations Using Pre-trained Language Models Lifu Tu, Garima Lalwani, Spandana Gella, He He
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Imitation Attacks And Defenses For Black-box Machine Translation Systems Eric Wallace, Mitchell Stern, Dawn Song
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- A Closer Look At The Robustness Of Vision-and-language Pre-trained Models Linjie Li, Zhe Gan, Jingjing Liu
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Dynaboard: An Evaluation-as-a-service Platform For Holistic Next-generation Benchmarking Zhiyi Ma et al.
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- The Power Of Scale For Parameter-efficient Prompt Tuning Brian Lester, Rami Al-rfou, Noah Constant
- Adversarial GLUE: A Multi-task Benchmark For Robustness Evaluation Of Language Models Boxin Wang et al.
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- Defending Against Backdoor Attacks In Natural Language Generation Xiaofei Sun et al.
- Evaluating Large Language Models Trained On Code Mark Chen et al.
- Improving Question Answering Model Robustness With Synthetic Adversarial Data Generation Max Bartolo et al.
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Evaluating The Robustness Of Neural Language Models To Input Perturbations Milad Moradi, Matthias Samwald
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Maieutic Prompting: Logically Consistent Reasoning With Recursive Explanations Jaehun Jung et al.
- Teaching Language Models To Support Answers With Verified Quotes Jacob Menick et al.
- Exploring Visual Prompts For Adapting Large-scale Models Hyojin Bahng, Ali Jahanian, Swami Sankaranarayanan, Phillip Isola
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Lost At C: A User Study On The Security Implications Of Large Language Model Code Assistants Gustavo Sandoval et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Toxigen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Are Large Pre-trained Language Models Leaking Your Personal Information? Jie Huang, Hanyin Shao, Kevin Chen-chuan Chang
- Shortcut Learning Of Large Language Models In Natural Language Understanding Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-asl, Wenhao Liu, Caiming Xiong
- Complexity-based Prompting For Multi-step Reasoning Yao Fu, Hao Peng, Ashish Sabharwal, Peter Clark, Tushar Khot
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- Qaner: Prompting Question Answering Models For Few-shot Named Entity Recognition Andy T. Liu et al.
- Improving Alignment Of Dialogue Agents Via Targeted Human Judgements Amelia Glaese et al.
- Commonsenseqa 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Storydall-e: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- Holistic Evaluation Of Language Models Percy Liang et al.
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Spear Phishing With Large Language Models Julian Hazell
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- Fake News In Sheep's Clothing: Robust Fake News Detection Against Llm-empowered Style Attacks Jiaying Wu, Jiafeng Guo, Bryan Hooi
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Benchmarking Large Language Models In Retrieval-augmented Generation Jiawei Chen, Hongyu Lin, Xianpei Han, Le Sun
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- Ferret: Refer And Ground Anything Anywhere At Any Granularity Haoxuan You et al.
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Privacy In Large Language Models: Attacks, Defenses And Future Directions Haoran Li et al.
- Generating Phishing Attacks Using Chatgpt Sayak Saha Roy, Krishna Vamsi Naragam, Shirin Nilizadeh
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Audiogpt: Understanding And Generating Speech, Music, Sound, And Talking Head Rongjie Huang et al.
- Open Sesame! Universal Black Box Jailbreaking Of Large Language Models Raz Lapid, Ron Langberg, Moshe Sipper
- How Secure Is Code Generated By Chatgpt? Raphaël Khoury, Anderson R. Avila, Jacob Brunelle, Baba Mamadou Camara
- Prompting The Hidden Talent Of Web-scale Speech Models For Zero-shot Task Generalization Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Are Aligned Neural Networks Adversarially Aligned? Nicholas Carlini et al.
- Clever Hans Or Neural Theory Of Mind? Stress Testing Social Reasoning In Large Language Models Natalie Shapira et al.
- Benefits And Harms Of Large Language Models In Digital Mental Health Munmun De Choudhury, Sachin R. Pendse, Neha Kumar
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- State Of What Art? A Call For Multi-prompt LLM Evaluation Moran Mizrahi et al.
- The Troubling Emergence Of Hallucination In Large Language Models -- An Extensive Definition, Quantification, And Prescriptive Remediations Vipula Rawte et al.
- Can Ai-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- Is GPT-4 A Reliable Rater? Evaluating Consistency In GPT-4 Text Ratings Veronika Hackl, Alexandra Elena Müller, Michael Granitzer, Maximilian Sailer
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- RLHF-V: Towards Trustworthy Mllms Via Behavior Alignment From Fine-grained Correctional Human Feedback Tianyu Yu et al.
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Red Teaming Chatgpt Via Jailbreaking: Bias, Robustness, Reliability And Toxicity Terry Yue Zhuo, Yujin Huang, Chunyang Chen, Zhenchang Xing
- Observations On Llms For Telecom Domain: Capabilities And Limitations Sumit Soman, Ranjani H G
- Is Chatgpt A Good Translator? Yes With GPT-4 As The Engine Wenxiang Jiao et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Visual Adversarial Examples Jailbreak Aligned Large Language Models Xiangyu Qi et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Unveiling Security, Privacy, And Ethical Concerns Of Chatgpt Xiaodong Wu, Ran Duan, Jianbing Ni
- Preference Ranking Optimization For Human Alignment Feifan Song et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Exploiting Programmatic Behavior Of Llms: Dual-use Through Standard Security Attacks Daniel Kang et al.
- Can Large Language Models Be An Alternative To Human Evaluations? Cheng-han Chiang, Hung-yi Lee
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- How Close Is Chatgpt To Human Experts? Comparison Corpus, Evaluation, And Detection Biyang Guo et al.
- Universal And Transferable Adversarial Attacks On Aligned Language Models Andy Zou et al.
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- A Categorical Archive Of Chatgpt Failures Ali Borji
- The (ab)use Of Open Source Code To Train Large Language Models Ali Al-kaswan, Maliheh Izadi
- Smoothllm: Defending Large Language Models Against Jailbreaking Attacks Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- How Robust Is GPT-3.5 To Predecessors? A Comprehensive Study On Language Understanding Tasks Xuanting Chen et al.
- On Evaluating Adversarial Robustness Of Large Vision-language Models Yunqing Zhao et al.
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- "do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- In Chatgpt We Trust? Measuring And Characterizing The Reliability Of Chatgpt Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Trustllm: Trustworthiness In Large Language Models Yue Huang et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- How Johnny Can Persuade Llms To Jailbreak Them: Rethinking Persuasion To Challenge AI Safety By Humanizing Llms Yi Zeng et al.
🏷 SLT
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Distilling Knowledge Learned In BERT For Text Generation Yen-chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- BERT, Mbert, Or Bibert? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
🏷 Survey Paper
- Why Are Sequence-to-sequence Models So Dull? Understanding The Low-diversity Problem Of Chatbots Shaojie Jiang, Maarten De Rijke
- DP-GAN: Diversity-promoting Generative Adversarial Network For Generating Informative And Diversified Text Jingjing Xu, Xuancheng Ren, Junyang Lin, Xu Sun
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- A Survey Of Natural Language Generation Techniques With A Focus On Dialogue Systems - Past, Present And Future Directions Sashank Santhanam, Samira Shaikh
- Deep Learning Based Chatbot Models Richard Csaky
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- A Survey Of Knowledge-enhanced Text Generation Wenhao Yu et al.
- A Closer Look At The Robustness Of Vision-and-language Pre-trained Models Linjie Li, Zhe Gan, Jingjing Liu
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- A Short Survey Of Pre-trained Language Models For Conversational AI-A Newage In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-lm Deepak Narayanan et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Characterchat: Supporting The Creation Of Fictional Characters Through Conversation And Progressive Manifestation With A Chatbot Oliver Schmitt, Daniel Buschek
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-rong Wen
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- Language Models As Agent Models Jacob Andreas
- A Survey On Retrieval-augmented Text Generation Huayang Li, Yixuan Su, Deng Cai, Yan Wang, Lemao Liu
- Reasoning With Language Model Prompting: A Survey Shuofei Qiao et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- Language Generation Models Can Cause Harm: So What Can We Do About It? An Actionable Survey Sachin Kumar, Vidhisha Balachandran, Lucille Njoo, Antonios Anastasopoulos, Yulia Tsvetkov
- Coditt5: Pretraining For Source Code And Natural Language Editing Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Milos Gligoric
- Towards Reasoning In Large Language Models: A Survey Jie Huang, Kevin Chen-chuan Chang
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- The Debate Over Understanding In Ai's Large Language Models Melanie Mitchell, David C. Krakauer
- Shortcut Learning Of Large Language Models In Natural Language Understanding Mengnan Du, Fengxiang He, Na Zou, Dacheng Tao, Xia Hu
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian Mcauley
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- A Survey Of Vision-language Pre-trained Models Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Gptaraeval: A Comprehensive Evaluation Of Chatgpt On Arabic NLP Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, Muhammad Abdul-mageed
- Co-writing With Opinionated Language Models Affects Users' Views Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, Mor Naaman
- Natural Language Generation And Understanding Of Big Code For Ai-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- A Bibliometric Review Of Large Language Models Research From 2017 To 2023 Lizhou Fan et al.
- Practical And Ethical Challenges Of Large Language Models In Education: A Systematic Scoping Review Lixiang Yan et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- A Survey On Hallucination In Large Language Models: Principles, Taxonomy, Challenges, And Open Questions Lei Huang et al.
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- A Systematic Survey Of Prompt Engineering On Vision-language Foundation Models Jindong Gu et al.
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- On The Robustness Of Chatgpt: An Adversarial And Out-of-distribution Perspective Jindong Wang et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- "it's Not Like Jarvis, But It's Pretty Close!" -- Examining Chatgpt's Usage Among Undergraduate Students In Computer Science Ishika Joshi, Ritvik Budhiraja, Harshal D Akolekar, Jagat Sesh Challa, Dhruv Kumar
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- Augmented Language Models: A Survey Grégoire Mialon et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Decoding Chatgpt: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Are Emergent Abilities Of Large Language Models A Mirage? Rylan Schaeffer, Brando Miranda, Sanmi Koyejo
- Medalign: A Clinician-generated Dataset For Instruction Following With Electronic Medical Records Scott L. Fleming et al.
- The Science Of Detecting Llm-generated Texts Ruixiang Tang, Yu-neng Chuang, Xia Hu
- Gpteval: A Survey On Assessments Of Chatgpt And GPT-4 Rui Mao, Guanyi Chen, Xulang Zhang, Frank Guerin, Erik Cambria
- Beyond Memorization: Violating Privacy Via Inference With Large Language Models Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Evaluation Of Chatgpt-generated Medical Responses: A Systematic Review And Meta-analysis Qiuhong Wei et al.
- Can Large Language Models Replace Humans In The Systematic Review Process? Evaluating Gpt-4's Efficacy In Screening And Extracting Data From Peer-reviewed And Grey Literature In Multiple Languages Qusai Khraisha, Sophie Put, Johanna Kappenberg, Azza Warraitch, Kristin Hadfield
- Retrieving Multimodal Information For Augmented Generation: A Survey Ruochen Zhao et al.
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Large Language Model Alignment: A Survey Tianhao Shen et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Survey Of Vulnerabilities In Large Language Models Revealed By Adversarial Attacks Erfan Shayegani et al.
- Simulating H.P. Lovecraft Horror Literature With The Chatgpt Large Language Model Eduardo C. Garrido-merchán, José Luis Arroyo-barrigüete, Roberto Gozalo-brizuela
- A Short Survey Of Viewing Large Language Models In Legal Aspect Zhongxiang Sun
- Large Language Models For Generative Information Extraction: A Survey Derong Xu et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- One Small Step For Generative AI, One Giant Leap For AGI: A Complete Survey On Chatgpt In AIGC Era Chaoning Zhang et al.
- Large Language Models On Graphs: A Comprehensive Survey Bowen Jin et al.
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Generative AI: Implications And Applications For Education Anastasia Olnancy Olga et al.
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- A Survey On Large Language Model (LLM) Security And Privacy: The Good, The Bad, And The Ugly Yifan Yao et al.
- A Comprehensive Survey Of Ai-generated Content (AIGC): A History Of Generative AI From GAN To Chatgpt Yihan Cao et al.
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Trustworthy Llms: A Survey And Guideline For Evaluating Large Language Models' Alignment Yang Liu et al.
- A Survey On Model Compression For Large Language Models Xunyu Zhu, Jian Li, Yong Liu, Can Ma, Weiping Wang
- Fine-tuning Llama For Multi-stage Text Retrieval Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Large Language Models In Healthcare And Medical Domain: A Review Zabir Al Nazi, Wei Peng
- Vision Language Models In Autonomous Driving: A Survey And Outlook Xingcheng Zhou et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Monitoring Ai-modified Content At Scale: A Case Study On The Impact Of Chatgpt On AI Conference Peer Reviews Weixin Liang et al.
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Mapping The Ethics Of Generative AI: A Comprehensive Scoping Review Thilo Hagendorff
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M Towhidul Islam Tonmoy et al.
- Large Language Models And Games: A Survey And Roadmap Roberto Gallotta et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- AI Hallucinations: A Misnomer Worth Clarifying Negar Maleki, Balaji Padmanabhan, Kaushik Dutta
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Codeaid: Evaluating A Classroom Deployment Of An Llm-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- (A)I Am Not A Lawyer, But...: Engaging Legal Experts Towards Responsible LLM Policies For Legal Advice Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang
- Revolutionizing Finance With Llms: An Overview Of Applications And Insights Huaqin Zhao et al.
- Large Language Models In Cybersecurity: State-of-the-art Farzad Nourmohammadzadeh Motlagh et al.
- Ai-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- A Review Of Modern Recommender Systems Using Generative Models (gen-recsys) Yashar Deldjoo et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
🏷 TACL
🏷 Tokenization
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- Leveraging Large Language Models For Multiple Choice Question Answering Joshua Robinson, Christopher Michael Rytting, David Wingate
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Make-a-scene: Scene-based Text-to-image Generation With Human Priors Oran Gafni et al.
- Video-llava: Learning United Visual Representation By Alignment Before Projection Bin Lin et al.
- Language Model Tokenizers Introduce Unfairness Between Languages Aleksandar Petrov, Emanuele La Malfa, Philip H. S. Torr, Adel Bibi
🏷 Tools
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Topic Aware Neural Response Generation Chen Xing et al.
- A User Simulator For Task-completion Dialogues Xiujun Li et al.
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- End-to-end Optimization Of Goal-driven And Visually Grounded Dialogue Systems Florian Strub et al.
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Neural Personalized Response Generation As Domain Adaptation Weinan Zhang, Ting Liu, Yifa Wang, Qingfu Zhu
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- Latent Intention Dialogue Models Tsung-hsien Wen, Yishu Miao, Phil Blunsom, Steve Young
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Sequence-to-sequence Learning For Task-oriented Dialogue With Dialogue State Representation Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin, Ting Liu
- Language Gans Falling Short Massimo Caccia et al.
- Seq2rdf: An End-to-end Application For Deriving Triples From Natural Language Text Yue Liu, Tongtao Zhang, Zhicheng Liang, Heng Ji, Deborah L. Mcguinness
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- Advancing The State Of The Art In Open Domain Dialog Systems Through The Alexa Prize Chandra Khatri et al.
- A Retrieve-and-edit Framework For Predicting Structured Outputs Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy Liang
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Skeleton-to-response: Dialogue Generation Guided By Retrieval Memory Deng Cai et al.
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- A Study Of Reinforcement Learning For Neural Machine Translation Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, Tie-yan Liu
- "bilingual Expert" Can Find Translation Errors Kai Fan et al.
- Controllable Neural Story Plot Generation Via Reward Shaping Pradyumna Tambwekar et al.
- Building A Conversational Agent Overnight With Dialogue Self-play Pararth Shah et al.
- Babyai: A Platform To Study The Sample Efficiency Of Grounded Language Learning Maxime Chevalier-boisvert et al.
- Conversational AI: The Science Behind The Alexa Prize Ashwin Ram et al.
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- VisualBERT: A Simple And Performant Baseline For Vision And Language Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang
- Generating Empathetic Responses By Looking Ahead The User's Sentiment Jamin Shin, Peng Xu, Andrea Madotto, Pascale Fung
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu
- Using Natural Language For Reward Shaping In Reinforcement Learning Prasoon Goyal, Scott Niekum, Raymond J. Mooney
- Multi-step Retriever-reader Interaction For Scalable Open-domain Question Answering Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Approximating Interactive Human Evaluation With Self-play For Open-domain Dialog Systems Asma Ghandeharioun et al.
- Pythia: AI-assisted Code Completion System Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- exBERT: A Visual Analysis Tool To Explore Learned Representations In Transformer Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- Reinforced Dynamic Reasoning For Conversational Question Generation Boyuan Pan, Hao Li, Ziyu Yao, Deng Cai, Huan Sun
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan McCann, Caiming Xiong, Richard Socher
- Megatron-LM: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- A Modular Task-oriented Dialogue System Using A Neural Mixture-of-experts Jiahuan Pei, Pengjie Ren, Maarten de Rijke
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- TinyBERT: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- NeMo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Towards Scalable Multi-domain Conversational Agents: The Schema-guided Dialogue Dataset Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- ConveRT: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Masked Language Model Scoring Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff
- EmpDG: Multi-resolution Interactive Empathetic Dialogue Generation Qintong Li et al.
- Modeling Graph Structure In Transformer For Better AMR-to-Text Generation Jie Zhu et al.
- JuICe: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Machine Reading Comprehension: The Role Of Contextualized Language Models And Beyond Zhuosheng Zhang, Hai Zhao, Rui Wang
- UnQovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- STORIUM: A Dataset And Evaluation Platform For Machine-in-the-loop Story Generation Nader Akoury et al.
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- CoDA: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-Kyun Kim, Yunjie Gu
- RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- Meaningful Answer Generation Of E-commerce Question-answering Shen Gao, Xiuying Chen, Zhaochun Ren, Dongyan Zhao, Rui Yan
- Enabling Language Models To Fill In The Blanks Chris Donahue, Mina Lee, Percy Liang
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- GShard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- LayoutLMv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- MinTL: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- Fine-tuning Pre-trained Language Model With Weak Supervision: A Contrastive-regularized Self-training Approach Yue Yu et al.
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- BERT-hLSTMs: BERT And Hierarchical LSTMs For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- You Impress Me: Dialogue Generation Via Mutual Persona Perception Qian Liu et al.
- MixKD: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Neural Generation Meets Real People: Towards Emotionally Engaging Mixed-initiative Conversations Ashwin Paranjape et al.
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- EdgeBERT: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- AdapterHub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- IntelliCode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Grounding Language To Autonomously-acquired Skills Via Goal Generation Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud
- Schema-guided Dialogue State Tracking Task At DSTC8 Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- MEGATRON-CNTRL: Controllable Story Generation With External Knowledge Using Large-scale Language Models Peng Xu et al.
- Rapidly Bootstrapping A Question Answering Dataset For COVID-19 Raphael Tang et al.
- Trojaning Language Models For Fun And Profit Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- LightSeq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- Trading Off Diversity And Quality In Natural Language Generation Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- GO FIGURE: A Meta Evaluation Of Factuality In Summarization Saadia Gabriel, Asli Celikyilmaz, Rahul Jha, Yejin Choi, Jianfeng Gao
- Mixup-Transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- CoSDA-ML: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- PLATO-2: Towards Building An Open-domain Chatbot Via Curriculum Learning Siqi Bao et al.
- Template Guided Text Generation For Task-oriented Dialogue Mihir Kale, Abhinav Rastogi
- CoreGen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- The Language Interpretability Tool: Extensible, Interactive Visualizations And Analysis For NLP Models Ian Tenney et al.
- Data Manipulation: Towards Effective Instance Learning For Neural Dialogue Generation Via Learning To Augment And Reweight Hengyi Cai et al.
- FewshotQA: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- ERNIE-ViLG: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- RedditBias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Dynaboard: An Evaluation-as-a-service Platform For Holistic Next-generation Benchmarking Zhiyi Ma et al.
- WenLan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-Kedziorski, Noah A. Smith, Yejin Choi
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Progressive Transformer-based Generation Of Radiology Reports Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- DeltaLM: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And XLNet M. Onat Topal, Anil Bas, Imke Van Heerden
- CodeXGLUE: A Machine Learning Benchmark Dataset For Code Understanding And Generation Shuai Lu et al.
- Unifying Vision-and-language Tasks Via Text Generation Jaemin Cho, Jie Lei, Hao Tan, Mohit Bansal
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- MetaICL: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-Wei Chang, Premkumar Natarajan, Nanyun Peng
- Revealing Persona Biases In Dialogue Systems Emily Sheng, Josh Arnold, Zhou Yu, Kai-Wei Chang, Nanyun Peng
- AraT5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-Mageed
- UniPELT: A Unified Framework For Parameter-efficient Language Model Tuning Yuning Mao et al.
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- A Short Survey Of Pre-trained Language Models For Conversational AI - A New Age In NLP Munazza Zaib, Quan Z. Sheng, Wei Emma Zhang
- Efficient Retrieval Augmented Generation From Unstructured Knowledge For Task-oriented Dialog David Thulke, Nico Daheim, Christian Dugast, Hermann Ney
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Language Model Evaluation Beyond Perplexity Clara Meister, Ryan Cotterell
- Empowering News Recommendation With Pre-trained Language Models Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang
- One Teacher Is Enough? Pre-trained Language Model Distillation From Multiple Teachers Chuhan Wu, Fangzhao Wu, Yongfeng Huang
- NewsBERT: Distilling Pre-trained Language Model For Intelligent News Application Chuhan Wu et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- LFPT5: A Unified Framework For Lifelong Few-shot Language Learning Based On Prompt Tuning Of T5 Chengwei Qin, Shafiq Joty
- CodeT5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- TeraPipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- BARTScore: Evaluating Generated Text As Text Generation Weizhe Yuan, Graham Neubig, Pengfei Liu
- VL-Adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-Lin Sung, Jaemin Cho, Mohit Bansal
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- Summ^N: A Multi-stage Summarization Framework For Long Input Dialogues And Documents Yusen Zhang et al.
- URLTran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- Multilingual LAMA: Investigating Knowledge In Multilingual Pretrained Language Models Nora Kassner, Philipp Dufter, Hinrich Schütze
- MarIA: Spanish Language Models Asier Gutiérrez-Fandiño et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- OpenPrompt: An Open-source Framework For Prompt-learning Ning Ding et al.
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- SimVLM: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- DenseCLIP: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- Scalable And Efficient MoE Training For Multitask Multilingual Models Young Jin Kim et al.
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Pre-train, Prompt, And Predict: A Systematic Survey Of Prompting Methods In Natural Language Processing Pengfei Liu et al.
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- Constrained Language Models Yield Few-shot Semantic Parsers Richard Shin et al.
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- e-ViL: A Dataset And Benchmark For Natural Language Explanations In Vision-language Tasks Maxime Kayser et al.
- Multimodal Few-shot Learning With Frozen Language Models Maria Tsimpoukelli et al.
- DialogLM: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Retrieval Augmented Code Generation And Summarization Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- FastMoE: A Fast Mixture-of-expert Training System Jiaao He et al.
- Few-shot Conversational Dense Retrieval Shi Yu, Zhenghao Liu, Chenyan Xiong, Tao Feng, Zhiyuan Liu
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- ReACC: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- CodeRL: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- ReAct: Synergizing Reasoning And Acting In Language Models Shunyu Yao et al.
- RetroMAE: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Selective Annotation Makes Language Models Better Few-shot Learners Hongjin Su et al.
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Instruction Tuning For Few-shot Aspect-based Sentiment Analysis Siddharth Varia et al.
- Teaching Algorithmic Reasoning Via In-context Learning Hattie Zhou et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- PromptSource: An Integrated Development Environment And Repository For Natural Language Prompts Stephen H. Bach et al.
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Lost At C: A User Study On The Security Implications Of Large Language Model Code Assistants Gustavo Sandoval et al.
- HealthPrompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Self-adaptive In-context Learning: An Information Compression Perspective For In-context Example Selection And Ordering Zhiyong Wu, Yaoxiang Wang, Jiacheng Ye, Lingpeng Kong
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- VL-InterpreT: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- CodeGen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- SgVA-CLIP: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- VL-CheckList: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- ToxiGen: A Large-scale Machine-generated Dataset For Adversarial And Implicit Hate Speech Detection Thomas Hartvigsen et al.
- UnifiedSKG: Unifying And Multi-tasking Structured Knowledge Grounding With Text-to-text Language Models Tianbao Xie et al.
- FlashAttention: Fast And Memory-efficient Exact Attention With IO-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- Re3: Generating Longer Stories With Recursive Reprompting And Revision Kevin Yang, Yuandong Tian, Nanyun Peng, Dan Klein
- Automatic Generation Of Programming Exercises And Code Explanations Using Large Language Models Sami Sarsa, Paul Denny, Arto Hellas, Juho Leinonen
- Action-GPT: Leveraging Large-scale Language Models For Improved And Generalized Action Generation Sai Shashank Kalakonda, Shubh Maheshwari, Ravi Kiran Sarvadevabhatla
- Large Language Models Encode Clinical Knowledge Karan Singhal et al.
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- DeepSpeed-MoE: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- ChatGPT: The End Of Online Exam Integrity? Teo Susnjak
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Using DeepSpeed And Megatron To Train Megatron-Turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- PaLI: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- Confident Adaptive Language Modeling Tal Schuster et al.
- Flamingo: A Visual Language Model For Few-shot Learning Jean-Baptiste Alayrac et al.
- DualPrompt: Complementary Prompting For Rehearsal-free Continual Learning Zifeng Wang et al.
- Decomposed Prompting: A Modular Approach For Solving Complex Tasks Tushar Khot et al.
- Who Is GPT-3? An Exploration Of Personality, Values And Demographics Marilù Miotto, Nicola Rossberg, Bennett Kleinberg
- DialFRED: Dialogue-enabled Agents For Embodied Instruction Following Xiaofeng Gao et al.
- Prompting Is Programming: A Query Language For Large Language Models Luca Beurer-Kellner, Marc Fischer, Martin Vechev
- Black-box Tuning For Language-model-as-a-service Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
- InstructionNER: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Towards A Unified Multi-dimensional Evaluator For Text Generation Ming Zhong et al.
- Evaluating Human-language Model Interaction Mina Lee et al.
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- Rationale-augmented Ensembles In Language Models Xuezhi Wang et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Large Language Models Meet NL2Code: A Survey Daoguang Zan et al.
- ByteTransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Competition-level Code Generation With AlphaCode Yujia Li et al.
- Language And Culture Internalisation For Human-like Autotelic AI Cédric Colas, Tristan Karch, Clément Moulin-Frier, Pierre-Yves Oudeyer
- DS-1000: A Natural And Reliable Benchmark For Data Science Code Generation Yuhang Lai et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- CoNT: Contrastive Neural Text Generation Chenxin An et al.
- AudioLM: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- A Unified Multi-task Learning Framework For Multi-goal Conversational Recommender Systems Yang Deng et al.
- In-context Learning For Few-shot Dialogue State Tracking Yushi Hu et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian McAuley, Wayne Xin Zhao
- Iteratively Prompt Pre-trained Language Models For Chain Of Thought Boshi Wang, Xiang Deng, Huan Sun
- Long Time No See! Open-domain Conversation With Long-term Persona Memory Xinchao Xu et al.
- Attributed Question Answering: Evaluation And Modeling For Attributed Large Language Models Bernd Bohnet et al.
- Multi-lingual Evaluation Of Code Generation Models Ben Athiwaratkun et al.
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-Collados
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Don't Generate, Discriminate: A Proposal For Grounding Language Models To Real-world Environments Yu Gu, Xiang Deng, Yu Su
- GrIPS: Gradient-free, Edit-based Instruction Search For Prompting Large Language Models Archiki Prasad, Peter Hase, Xiang Zhou, Mohit Bansal
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Plug-and-play VQA: Zero-shot VQA By Conjoining Large Pretrained Models With Zero Training Anthony Meng Huat Tiong, Junnan Li, Boyang Li, Silvio Savarese, Steven C. H. Hoi
- Socratic Models: Composing Zero-shot Multimodal Reasoning With Language Andy Zeng et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- CommonsenseQA 2.0: Exposing The Limits Of AI Through Gamification Alon Talmor et al.
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- A Systematic Review And Replicability Study Of BERT4Rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Prompting GPT-3 To Be Reliable Chenglei Si et al.
- Solving Quantitative Reasoning Problems With Language Models Aitor Lewkowycz et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- What Is It Like To Program With Artificial Intelligence? Advait Sarkar et al.
- Scaling Up Models And Data With t5x And seqio Adam Roberts et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Code Generation Tools (almost) For Free? A Study Of Few-shot, Pre-trained Language Models On Code Patrick Bareiß, Beatriz Souza, Marcelo D'amorim, Michael Pradel
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Co-writing Screenplays And Theatre Scripts With Language Models: An Evaluation By Industry Professionals Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, Richard Evans
- Few-shot Training LLMs For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Demonstrate-Search-Predict: Composing Retrieval And Language Models For Knowledge-intensive NLP Omar Khattab et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- 3DALL-E: Integrating Text-to-image AI In 3D Design Workflows Vivian Liu, Jo Vermeulen, George Fitzmaurice, Justin Matejka
- Talking About Large Language Models Murray Shanahan
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Time-llm: Time Series Forecasting By Reprogramming Large Language Models Ming Jin et al.
- Selenite: Scaffolding Online Sensemaking With Comprehensive Overviews Elicited From Large Language Models Michael Xieyang Liu et al.
- Detecting Llm-generated Text In Computing Education: A Comparative Study For Chatgpt Cases Michael Sheinman Orenstrakh, Oscar Karnalim, Carlos Anibal Suarez, Michael Liut
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Quantifying Language Models' Sensitivity To Spurious Features In Prompt Design Or: How I Learned To Start Worrying About Prompt Formatting Melanie Sclar, Yejin Choi, Yulia Tsvetkov, Alane Suhr
- Zero- And Few-shot Prompting With Llms: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- Leancontext: Cost-efficient Domain-specific Question Answering Using Llms Md Adnan Arefeen, Biplob Debnath, Srimat Chakradhar
- CRITIC: Large Language Models Can Self-correct With Tool-interactive Critiquing Zhibin Gou et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Nl2spec: Interactively Translating Unstructured Natural Language To Temporal Logics With Large Language Models Matthias Cosler, Christopher Hahn, Daniel Mendoza, Frederik Schmitt, Caroline Trippel
- Enhancing CLIP With GPT-4: Harnessing Visual Descriptions As Prompts Mayug Maniparambil et al.
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- From Word Models To World Models: Translating From Natural Language To The Probabilistic Language Of Thought Lionel Wong et al.
- Generative Artificial Intelligence In Learning Analytics: Contextualising Opportunities And Challenges Through The Learning Analytics Cycle Lixiang Yan, Roberto Martinez-maldonado, Dragan Gašević
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning Linhao Luo, Yuan-fang Li, Gholamreza Haffari, Shirui Pan
- Leveraging Pre-trained Large Language Models To Construct And Utilize World Models For Model-based Task Planning Lin Guan, Karthik Valmeekam, Sarath Sreedharan, Subbarao Kambhampati
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Judging Llm-as-a-judge With Mt-bench And Chatbot Arena Lianmin Zheng et al.
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Superclue: A Comprehensive Chinese Large Language Model Benchmark Liang Xu et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- 14 Examples Of How Llms Can Transform Materials Science And Chemistry: A Reflection On A Large Language Model Hackathon Kevin Maik Jablonka et al.
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Domain-specific Chatbots For Science Using Embeddings Kevin G. Yager
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- LLM In A Flash: Efficient Large Language Model Inference With Limited Memory Keivan Alizadeh et al.
- Chatgpt Chemistry Assistant For Text Mining And Prediction Of MOF Synthesis Zhiling Zheng, Oufan Zhang, Christian Borgs, Jennifer T. Chayes, Omar M. Yaghi
- Evaluating Language Models For Mathematics Through Interactions Katherine M. Collins et al.
- Waffling Around For Performance: Visual Classification With Random Words And Broad Concepts Karsten Roth et al.
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Towards Expert-level Medical Question Answering With Large Language Models Karan Singhal et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- Speechprompt V2: Prompt Tuning For Speech Classification Tasks Kai-wei Chang et al.
- Not What You've Signed Up For: Compromising Real-world Llm-integrated Applications With Indirect Prompt Injection Kai Greshake et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Logic-lm: Empowering Large Language Models With Symbolic Solvers For Faithful Logical Reasoning Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- Ai-augmented Surveys: Leveraging Large Language Models And Surveys For Opinion Prediction Junsol Kim, Byungkyu Lee
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Agentcf: Collaborative Learning With Autonomous Language Agents For Recommender Systems Junjie Zhang et al.
- Recommendation As Instruction Following: A Large Language Model Empowered Recommendation Approach Junjie Zhang et al.
- Evaluating GPT-4 And Chatgpt On Japanese Medical Licensing Examinations Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev
- Structgpt: A General Framework For Large Language Model To Reason Over Structured Data Jinhao Jiang et al.
- Gptscore: Evaluate As You Desire Jinlan Fu, See-kiong Ng, Zhengbao Jiang, Pengfei Liu
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Ethical Chatgpt: Concerns, Challenges, And Commandments Jianlong Zhou, Heimo Müller, Andreas Holzinger, Fang Chen
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- The Impact Of Chatgpt And Llms On Medical Imaging Stakeholders: Perspectives And Use Cases Jiancheng Yang, Hongwei Bran Li, Donglai Wei
- Onellm: One Framework To Align All Modalities With Language Jiaming Han et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Large Language Models In Medicine: The Potentials And Pitfalls Jesutofunmi A. Omiye, Haiwen Gui, Shawheen J. Rezaei, James Zou, Roxana Daneshjou
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Artificial Muses: Generative Artificial Intelligence Chatbots Have Risen To Human-level Creativity Jennifer Haase, Paul H. P. Hanel
- Prompting Is Not A Substitute For Probability Measurements In Large Language Models Jennifer Hu, Roger Levy
- MEGA: Multilingual Evaluation Of Generative AI Kabir Ahuja et al.
- Multimodal Chatgpt For Medical Applications: An Experimental Study Of GPT-4V Zhiling Yan et al.
- The Robots Are Here: Navigating The Generative AI Revolution In Computing Education James Prather et al.
- Chainforge: A Visual Toolkit For Prompt Engineering And LLM Hypothesis Testing Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Factuality Challenges In The Era Of Large Language Models Isabelle Augenstein et al.
- Building Cooperative Embodied Agents Modularly With Large Language Models Hongxin Zhang et al.
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Can Chatgpt Assess Human Personalities? A General Evaluation Framework Haocong Rao, Cyril Leung, Chunyan Miao
- Safety Assessment Of Chinese Large Language Models Hao Sun, Zhexin Zhang, Jiawen Deng, Jiale Cheng, Minlie Huang
- Lmdrive: Closed-loop End-to-end Driving With Large Language Models Hao Shao et al.
- Reasoning Implicit Sentiment With Chain-of-thought Prompting Hao Fei et al.
- Chatgpt For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- Applying Large Language Models And Chain-of-thought For Automatic Scoring Gyeong-geon Lee, Ehsan Latif, Xuansheng Wu, Ninghao Liu, Xiaoming Zhai
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Augmented Language Models: A Survey Grégoire Mialon et al.
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Level Generation Through Large Language Models Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius
- Chatgpt Perpetuates Gender Bias In Machine Translation And Ignores Non-gendered Pronouns: Findings Across Bengali And Five Other Low-resource Languages Sourojit Ghosh, Aylin Caliskan
- Metagpt: Meta Programming For A Multi-agent Collaborative Framework Sirui Hong et al.
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- Mind Meets Machine: Unravelling Gpt-4's Cognitive Psychology Sifatkaur Dhingra, Manmeet Singh, Vaisakh Sb, Neetiraj Malviya, Sukhpal Singh Gill
- Do Generative Large Language Models Need Billions Of Parameters? Sia Gholami, Marwan Omar
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- Tree Of Thoughts: Deliberate Problem Solving With Large Language Models Shunyu Yao et al.
- Opportunities And Challenges For Chatgpt And Large Language Models In Biomedicine And Health Shubo Tian et al.
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- VISAR: A Human-ai Argumentative Writing Assistant With Visual Programming And Rapid Draft Prototyping Zheng Zhang, Jie Gao, Ranjodh Singh Dhaliwal, Toby Jia-jun Li
- Toolkengpt: Augmenting Frozen Language Models With Massive Tools Via Tool Embeddings Shibo Hao, Tianyang Liu, Zhen Wang, Zhiting Hu
- Reasoning With Language Model Is Planning With World Model Shibo Hao et al.
- Llm-in-the-loop: Leveraging Large Language Model For Thematic Analysis Shih-chieh Dai, Aiping Xiong, Lun-wei Ku
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Ragas: Automated Evaluation Of Retrieval Augmented Generation Shahul Es, Jithin James, Luis Espinosa-anke, Steven Schockaert
- Gorilla: Large Language Model Connected With Massive Apis Shishir G. Patil, Tianjun Zhang, Xin Wang, Joseph E. Gonzalez
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- Chatgpt Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short Chatgpt-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Luminate: Structured Generation And Exploration Of Design Space With Large Language Models For Human-ai Co-creation Sangho Suh, Meng Chen, Bryan Min, Toby Jia-jun Li, Haijun Xia
- Ai-assisted Coding: Experiments With GPT-4 Russell A Poldrack, Thomas Lu, Gašper Beguš
- AI, Write An Essay For Me: A Large-scale Comparison Of Human-written Versus Chatgpt-generated Essays Steffen Herbold, Annette Hautli-janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Chatgpt Vs. Google: A Comparative Study Of Search Performance And User Experience Ruiyun Rayna Xu, Yue Katherine Feng, Hailiang Chen
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Llm-assisted Content Analysis: Using Large Language Models To Support Deductive Coding Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim
- Automatic Prompt Optimization With "Gradient Descent" And Beam Search Reid Pryzant et al.
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Starcoder: May The Source Be With You! Raymond Li et al.
- Lawyer Llama Technical Report Quzhe Huang et al.
- Codegeex: A Pre-trained Model For Code Generation With Multilingual Benchmarking On Humaneval-x Qinkai Zheng et al.
- Faithful Chain-of-thought Reasoning Qing Lyu et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Designerly Understanding: Information Needs For Model Transparency To Support Design Ideation For Ai-powered User Experience Q. Vera Liao, Hariharan Subramonyam, Jennifer Wang, Jennifer Wortman Vaughan
- Genegpt: Augmenting Large Language Models With Domain Tools For Improved Access To Biomedical Information Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu
- Autogen: Enabling Next-gen LLM Applications Via Multi-agent Conversation Qingyun Wu et al.
- Regulating Chatgpt And Other Large Generative AI Models Philipp Hacker, Andreas Engel, Marco Mauer
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Chat-univi: Unified Visual Representation Empowers Large Language Models With Image And Video Understanding Peng Jin, Ryuichi Takanobu, Wancai Zhang, Xiaochun Cao, Li Yuan
- Chameleon: Plug-and-play Compositional Reasoning With Large Language Models Pan Lu et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Dspy: Compiling Declarative Language Model Calls Into Self-improving Pipelines Omar Khattab et al.
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Self-contradictory Hallucinations Of Large Language Models: Evaluation, Detection And Mitigation Niels Mündler, Jingxuan He, Slobodan Jenko, Martin Vechev
- Consistency Analysis Of Chatgpt Myeongjun Erik Jang, Thomas Lukasiewicz
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Video-chatgpt: Towards Detailed Video Understanding Via Large Vision And Language Models Muhammad Maaz, Hanoona Rasheed, Salman Khan, Fahad Shahbaz Khan
- Benefits And Harms Of Large Language Models In Digital Mental Health Munmun De Choudhury, Sachin R. Pendse, Neha Kumar
- Abscribe: Rapid Exploration & Organization Of Multiple Writing Variations In Human-ai Co-writing Tasks Using Large Language Models Mohi Reza et al.
- Verify-and-edit: A Knowledge-enhanced Chain-of-thought Framework Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, Lidong Bing
- LIDA: A Tool For Automatic Generation Of Grammar-agnostic Visualizations And Infographics Using Large Language Models Victor Dibia
- Can Ai-generated Text Be Reliably Detected? Vinu Sankar Sadasivan, Aounon Kumar, Sriram Balasubramanian, Wenxiao Wang, Soheil Feizi
- Automated Reading Passage Generation With Openai's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- Fully Autonomous Programming With Large Language Models Vadim Liventsev, Anastasiia Grishina, Aki Härmä, Leon Moonen
- Generative AI For Programming Education: Benchmarking Chatgpt, GPT-4, And Human Tutors Tung Phung et al.
- Automating Human Tutor-style Programming Feedback: Leveraging GPT-4 Tutor Model For Hint Generation And GPT-3.5 Student Model For Hint Validation Tung Phung et al.
- Psy-llm: Scaling Up Global Mental Health Psychological Services With Ai-based Large Language Models Tin Lai et al.
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Encouraging Divergent Thinking In Large Language Models Through Multi-agent Debate Tian Liang et al.
- Cognitive Architectures For Language Agents Theodore R. Sumers, Shunyu Yao, Karthik Narasimhan, Thomas L. Griffiths
- Caption Anything: Interactive Image Description With Diverse Multimodal Controls Teng Wang et al.
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- Large Language Models In Education: Vision And Opportunities Wensheng Gan, Zhenlian Qi, Jiayang Wu, Jerry Chun-wei Lin
- Can Large Language Models Provide Useful Feedback On Research Papers? A Large-scale Empirical Analysis Weixin Liang et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- Mm-vet: Evaluating Large Multimodal Models For Integrated Capabilities Weihao Yu et al.
- Llmrec: Large Language Models With Graph Augmentation For Recommendation Wei Wei et al.
- GPT Detectors Are Biased Against Non-native English Writers Weixin Liang, Mert Yuksekgonul, Yining Mao, Eric Wu, James Zou
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- Deceptive AI Ecosystems: The Case Of Chatgpt Xiao Zhan, Yifan Xu, Stefan Sarkadi
- Rethinking The Evaluation For Conversational Recommendation In The Era Of Large Language Models Xiaolei Wang, Xinyu Tang, Wayne Xin Zhao, Jingyuan Wang, Ji-rong Wen
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- Repocoder: Repository-level Code Completion Through Iterative Retrieval And Generation Fengji Zhang et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Chatkbqa: A Generate-then-retrieve Framework For Knowledge Base Question Answering With Fine-tuned Large Language Models Haoran Luo et al.
- Chatgpt Outperforms Crowd-workers For Text-annotation Tasks Fabrizio Gilardi, Meysam Alizadeh, Maël Kubli
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Assigning AI: Seven Approaches For Students, With Prompts Ethan Mollick, Lilach Mollick
- Exploring Human-like Translation Strategy With Large Language Models Zhiwei He et al.
- Gptutor: A Chatgpt-powered Programming Tool For Code Explanation Eason Chen, Ray Huang, Han-shin Chen, Yuen-hsien Tseng, Liang-yi Li
- Llm-blender: Ensembling Large Language Models With Pairwise Ranking And Generative Fusion Dongfu Jiang, Xiang Ren, Bill Yuchen Lin
- Using An LLM To Help With Code Understanding Daye Nam, Andrew Macvean, Vincent Hellendoorn, Bogdan Vasilescu, Brad Myers
- REFINER: Reasoning Feedback On Intermediate Representations Debjit Paul et al.
- Exploiting Programmatic Behavior Of Llms: Dual-use Through Standard Security Attacks Daniel Kang et al.
- Almanac: Retrieval-augmented Language Models For Clinical Medicine Cyril Zakka et al.
- AI And The FCI: Can Chatgpt Project An Understanding Of Introductory Physics? Colin G. West
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- Opportunities And Risks Of Llms For Scalable Deliberation With Polis Christopher T. Small et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- Distilled GPT For Source Code Summarization Chia-yi Su, Collin Mcmillan
- Chateval: Towards Better Llm-based Evaluators Through Multi-agent Debate Chi-min Chan et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- Supporting Human-ai Collaboration In Auditing Llms With Llms Charvi Rastogi, Marco Tulio Ribeiro, Nicholas King, Harsha Nori, Saleema Amershi
- Chatdev: Communicative Agents For Software Development Chen Qian et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Llmseceval: A Dataset Of Natural Language Prompts For Security Evaluations Catherine Tony, Markus Mutas, Nicolás E. Díaz Ferreyra, Riccardo Scandariato
- Receive, Reason, And React: Drive As You Say With Large Language Models In Autonomous Vehicles Can Cui, Yunsheng Ma, Xu Cao, Wenqian Ye, Ziran Wang
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- LLM+P: Empowering Large Language Models With Optimal Planning Proficiency Bo Liu et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Evaluation Of Chatgpt For Nlp-based Mental Health Applications Bishal Lamichhane
- Swiftsage: A Generative Agent With Fast And Slow Thinking For Complex Interactive Tasks Bill Yuchen Lin et al.
- ART: Automatic Multi-step Reasoning And Tool-use For Large Language Models Bhargavi Paranjape et al.
- Facilitating Self-guided Mental Health Interventions Through Human-language Model Interaction: A Case Study Of Cognitive Restructuring Ashish Sharma, Kevin Rushton, Inna Wanyin Lin, Theresa Nguyen, Tim Althoff
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-yves Oudeyer
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Chemcrow: Augmenting Large-language Models With Chemistry Tools Andres M Bran et al.
- On Generative Agents In Recommendation An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-seng Chua
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- Robots That Ask For Help: Uncertainty Alignment For Large Language Model Planners Allen Z. Ren et al.
- Lamp: When Large Language Models Meet Personalization Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-lira, Yuehua Tang
- Chatgpt: More Than A Weapon Of Mass Deception, Ethical Challenges And Responses From The Human-centered Artificial Intelligence (HCAI) Perspective Alejo Jose G. Sison, Marco Tulio Daza, Roberto Gozalo-brizuela, Eduardo C. Garrido-merchán
- Self-rag: Learning To Retrieve, Generate, And Critique Through Self-reflection Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, Hannaneh Hajishirzi
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Can Chatgpt And Bard Generate Aligned Assessment Items? A Reliability Analysis Against Human Performance Abdolvahab Khademi
- Conversational Ai-powered Design: Chatgpt As Designer, User, And Product A. Baki Kocaballi
- Better To Ask In English: Cross-lingual Evaluation Of Large Language Models For Healthcare Queries Yiqiao Jin et al.
- Enhancing Job Recommendation Through Llm-based Generative Adversarial Networks Yingpeng Du et al.
- How Far Can Camels Go? Exploring The State Of Instruction Tuning On Open Resources Yizhong Wang et al.
- Can Chatgpt Replace Traditional KBQA Models? An In-depth Analysis Of The Question Answering Performance Of The GPT LLM Family Yiming Tan et al.
- Improving Factuality And Reasoning In Language Models Through Multiagent Debate Yilun Du, Shuang Li, Antonio Torralba, Joshua B. Tenenbaum, Igor Mordatch
- Mindmap: Knowledge Graph Prompting Sparks Graph Of Thoughts In Large Language Models Yilin Wen, Zifeng Wang, Jimeng Sun
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- A Multitask, Multilingual, Multimodal Evaluation Of Chatgpt On Reasoning, Hallucination, And Interactivity Yejin Bang et al.
- Chatpose: Chatting About 3D Human Pose Yao Feng et al.
- Beyond Chain-of-thought, Effective Graph-of-thought Reasoning In Language Models Yao Yao, Zuchao Li, Hai Zhao
- The Dark Side Of Chatgpt: Legal And Ethical Challenges From Stochastic Parrots And Hallucination Zihao Li
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Llama-vid: An Image Is Worth 2 Tokens In Large Language Models Yanwei Li, Chengyao Wang, Jiaya Jia
- G-eval: NLG Evaluation Using GPT-4 With Better Human Alignment Yang Liu et al.
- Recmind: Large Language Model Powered Agent For Recommendation Yancheng Wang et al.
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Integrating Action Knowledge And Llms For Task Planning And Situation Handling In Open Worlds Yan Ding et al.
- Llavar: Enhanced Visual Instruction Tuning For Text-rich Image Understanding Yanzhe Zhang et al.
- Representation Learning With Large Language Models For Recommendation Xubin Ren et al.
- Can Chatgpt Pass The Vietnamese National High School Graduation Examination? Xuan-quy Dao, Ngoc-bich Le, Xuan-dung Phan, Bac-bien Ngo
- Performance Comparison Of Large Language Models On VNHSGE English Dataset: Openai Chatgpt, Microsoft Bing Chat, And Google Bard Xuan-quy Dao
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- Chat With The Environment: Interactive Multimodal Perception Using Large Language Models Xufeng Zhao, Mengdi Li, Cornelius Weber, Muhammad Burhan Hafez, Stefan Wermter
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- Chat-rec: Towards Interactive And Explainable Llms-augmented Recommender System Yunfan Gao et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Contextual Object Detection With Multimodal Large Language Models Yuhang Zang, Wei Li, Jun Han, Kaiyang Zhou, Chen Change Loy
- Chatgraph: Interpretable Text Classification By Converting Chatgpt Knowledge To Graphs Yucheng Shi et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Fundamental Limitations Of Alignment In Large Language Models Yotam Wolf, Noam Wies, Oshri Avnery, Yoav Levine, Amnon Shashua
- Gpt4aigchip: Towards Next-generation AI Accelerator Design Automation Via Large Language Models Yonggan Fu et al.
- NL2TL: Transforming Natural Languages To Temporal Logics Using Large Language Models Yongchao Chen, Rujul Gandhi, Yang Zhang, Chuchu Fan
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- C-eval: A Multi-level Multi-discipline Chinese Evaluation Suite For Foundation Models Yuzhen Huang et al.
- Hard Prompts Made Easy: Gradient-based Discrete Optimization For Prompt Tuning And Discovery Yuxin Wen et al.
- Copiloting The Copilots: Fusing Large Language Models With Completion Engines For Automated Program Repair Yuxiang Wei, Chunqiu Steven Xia, Lingming Zhang
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Let The Llms Talk: Simulating Human-to-human Conversational QA Via Zero-shot Llm-to-llm Interactions Zahra Abbasiantaeb, Yifei Yuan, Evangelos Kanoulas, Mohammad Aliannejadi
- Batch Prompting: Efficient Inference With Large Language Model Apis Zhoujun Cheng, Jungo Kasai, Tao Yu
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- "do Anything Now": Characterizing And Evaluating In-the-wild Jailbreak Prompts On Large Language Models Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang
- Mitigating Large Language Model Hallucinations Via Autonomous Knowledge Graph-based Retrofitting Xinyan Guan et al.
- Query Rewriting For Retrieval-augmented Large Language Models Xinbei Ma, Yeyun Gong, Pengcheng He, Hai Zhao, Nan Duan
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- Chatbot Arena: An Open Platform For Evaluating Llms By Human Preference Wei-lin Chiang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Transformers Are Ssms: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- ChatGLM: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team GLM et al.
- Towards Conversational Diagnostic AI Tao Tu et al.
- Chatgpt As Research Scientist: Probing GPT's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- Large Language Models Meet Collaborative Filtering: An Efficient All-round LLM-based Recommender System Sein Kim et al.
- Beyond Code Generation: An Observational Study Of Chatgpt Usage In Software Engineering Practice Ranim Khojah, Mazen Mohamad, Philipp Leitner, Francisco Gomes De Oliveira Neto
- A Systematic Survey Of Prompt Engineering In Large Language Models: Techniques And Applications Pranab Sahoo et al.
- SNIFFER: Multimodal Large Language Model For Explainable Out-of-context Misinformation Detection Peng Qi, Zehong Yan, Wynne Hsu, Mong Li Lee
- Iris: An AI-driven Virtual Tutor For Computer Science Education Patrick Bassner, Eduard Frankford, Stephan Krusche
- Shaping Human-ai Collaboration: Varied Scaffolding Levels In Co-writing With Language Models Paramveer S. Dhillon et al.
- AI-augmented Brainwriting: Investigating The Use Of LLMs In Group Ideation Orit Shaer, Angelora Cooper, Osnat Mokryn, Andrew L. Kun, Hagit Ben Shoshan
- Same Task, More Tokens: The Impact Of Input Length On The Reasoning Performance Of Large Language Models Mosh Levy, Alon Jacoby, Yoav Goldberg
- A Piece Of Theatre: Investigating How Teachers Design LLM Chatbots To Assist Adolescent Cyberbullying Education Michael A. Hedderich et al.
- Large Legal Fictions: Profiling Legal Hallucinations In Large Language Models Matthew Dahl, Varun Magesh, Mirac Suzgun, Daniel E. Ho
- A Review Of Large Language Models And Autonomous Agents In Chemistry Mayk Caldas Ramos, Christopher J. Collison, Andrew D. White
- Codeaid: Evaluating A Classroom Deployment Of An LLM-based Programming Assistant That Balances Student And Educator Needs Majeed Kazemitabaar et al.
- Data Is All You Need: Finetuning LLMs For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- (A)I Am Not A Lawyer, But...: Engaging Legal Experts Towards Responsible LLM Policies For Legal Advice Inyoung Cheong, King Xia, K. J. Kevin Feng, Quan Ze Chen, Amy X. Zhang
- Materials Science In The Era Of Large Language Models: A Perspective Ge Lei, Ronan Docherty, Samuel J. Cooper
- AI-tutoring In Software Engineering Education Eduard Frankford, Clemens Sauerwein, Patrick Bassner, Stephan Krusche, Ruth Breu
- ChemLLM: A Chemical Large Language Model Di Zhang et al.
- Deepseek-coder: When The Large Language Model Meets Programming -- The Rise Of Code Intelligence Daya Guo et al.
- Generative AI In EU Law: Liability, Privacy, Intellectual Property, And Cybersecurity Claudio Novelli, Federico Casolari, Philipp Hacker, Giorgio Spedicato, Luciano Floridi
- Homogenization Effects Of Large Language Models On Human Creative Ideation Barrett R. Anderson, Jash Hemant Shah, Max Kreminski
- Taking The Next Step With Generative Artificial Intelligence: The Transformative Role Of Multimodal Large Language Models In Science Education Arne Bewersdorff et al.
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Survey On Large Language Model-enhanced Reinforcement Learning: Concept, Taxonomy, And Methods Yuji Cao et al.
- Large Language Models In Mental Health Care: A Scoping Review Yining Hua et al.
- Understanding Biases In Chatgpt-based Recommender Systems: Provider Fairness, Temporal Stability, And Recency Yashar Deldjoo
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Data-efficient Fine-tuning For LLM-based Recommendation Xinyu Lin et al.
- Prompting Large Language Models With Rationale Heuristics For Knowledge-based Visual Question Answering Zhongjian Hu, Peng Yang, Bing Li, Fengyuan Liu
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
- Quality Of Answers Of Generative Large Language Models Vs Peer Patients For Interpreting Lab Test Results For Lay Patients: Evaluation Study Zhe He et al.
- Measurement Of LLM's Philosophies Of Human Nature Minheng Ni et al.
🏷 Training Techniques
- Sequence-to-sequence Learning As Beam-search Optimization Sam Wiseman, Alexander M. Rush
- Programming With A Differentiable Forth Interpreter Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel
- An Actor-critic Algorithm For Sequence Prediction Dzmitry Bahdanau et al.
- Neural Response Generation With Dynamic Vocabularies Yu Wu et al.
- A Unified Query-based Generative Model For Question Generation And Question Answering Linfeng Song, Zhiguo Wang, Wael Hamza
- Data Distillation For Controlling Specificity In Dialogue Generation Jiwei Li, Will Monroe, Dan Jurafsky
- Fine Grained Knowledge Transfer For Personalized Task-oriented Dialogue Systems Kaixiang Mo, Yu Zhang, Qiang Yang, Pascale Fung
- Steering Output Style And Topic In Neural Response Generation Di Wang, Nebojsa Jojic, Chris Brockett, Eric Nyberg
- Attention Is All You Need Ashish Vaswani et al.
- Adversarial Learning For Neural Dialogue Generation Jiwei Li et al.
- Long Text Generation Via Adversarial Training With Leaked Information Jiaxian Guo et al.
- Mojitalk: Generating Emotional Responses At Scale Xianda Zhou, William Yang Wang
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Parlai: A Dialog Research Software Platform Alexander H. Miller et al.
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Another Diversity-promoting Objective Function For Neural Dialogue Generation Ryo Nakamura, Katsuhito Sudoh, Koichiro Yoshino, Satoshi Nakamura
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Can You Tell Me How To Get Past Sesame Street? Sentence-level Pretraining Beyond Language Modeling Alex Wang et al.
- Language Gans Falling Short Massimo Caccia et al.
- Fast Abstractive Summarization With Reinforce-selected Sentence Rewriting Yen-chun Chen, Mohit Bansal
- Adversarial Over-sensitivity And Over-stability Strategies For Dialogue Models Tong Niu, Mohit Bansal
- Generating Informative And Diverse Conversational Responses Via Adversarial Information Maximization Yizhe Zhang et al.
- A Retrieve-and-edit Framework For Predicting Structured Outputs Tatsunori B. Hashimoto, Kelvin Guu, Yonatan Oren, Percy Liang
- Retrieval-enhanced Adversarial Training For Neural Response Generation Qingfu Zhu, Lei Cui, Weinan Zhang, Furu Wei, Ting Liu
- Dialogue Generation: From Imitation Learning To Inverse Reinforcement Learning Ziming Li, Julia Kiseleva, Maarten De Rijke
- Polite Dialogue Generation Without Parallel Data Tong Niu, Mohit Bansal
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Skeleton-to-response: Dialogue Generation Guided By Retrieval Memory Deng Cai et al.
- Attention-guided Answer Distillation For Machine Reading Comprehension Minghao Hu et al.
- Toward Diverse Text Generation With Inverse Reinforcement Learning Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang
- Learn To Code-switch: Data Augmentation Using Copy Mechanism On Language Modeling Genta Indra Winata, Andrea Madotto, Chien-sheng Wu, Pascale Fung
- Extending Neural Generative Conversational Model Using External Knowledge Sources Prasanna Parthasarathi, Joelle Pineau
- A Study Of Reinforcement Learning For Neural Machine Translation Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, Tie-yan Liu
- Sentence Encoders On Stilts: Supplementary Training On Intermediate Labeled-data Tasks Jason Phang, Thibault Févry, Samuel R. Bowman
- Adversarially Regularising Neural NLI Models To Integrate Logical Background Knowledge Pasquale Minervini, Sebastian Riedel
- Simple Fusion: Return Of The Language Model Felix Stahlberg, James Cross, Veselin Stoyanov
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Emrqa: A Large Corpus For Question Answering On Electronic Medical Records Anusri Pampari, Preethi Raghavan, Jennifer Liang, Jian Peng
- Maskgan: Better Text Generation Via Filling In The______ William Fedus, Ian Goodfellow, Andrew M. Dai
- Training Millions Of Personalized Dialogue Agents Pierre-emmanuel Mazaré, Samuel Humeau, Martin Raison, Antoine Bordes
- Efficient Contextualized Representation: Language Model Pruning For Sequence Labeling Liyuan Liu, Xiang Ren, Jingbo Shang, Jian Peng, Jiawei Han
- Improving Machine Reading Comprehension With General Reading Strategies Kai Sun, Dian Yu, Dong Yu, Claire Cardie
- Structured Pruning Of Large Language Models Ziheng Wang, Jeremy Wohlwend, Tao Lei
- Multi-passage BERT: A Globally Normalized BERT Model For Open-domain Question Answering Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- Review Conversational Reading Comprehension Hu Xu, Bing Liu, Lei Shu, Philip S. Yu
- RoBERTa: A Robustly Optimized BERT Pretraining Approach Yinhan Liu et al.
- Well-read Students Learn Better: On The Importance Of Pre-training Compact Models Iulia Turc, Ming-wei Chang, Kenton Lee, Kristina Toutanova
- Probing Natural Language Inference Models Through Semantic Fragments Kyle Richardson, Hai Hu, Lawrence S. Moss, Ashish Sabharwal
- Generalization In Generation: A Closer Look At Exposure Bias Florian Schmidt
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Unsupervised Question Answering By Cloze Translation Patrick Lewis, Ludovic Denoyer, Sebastian Riedel
- Countering Language Drift Via Visual Grounding Jason Lee, Kyunghyun Cho, Douwe Kiela
- Visualbert: A Simple And Performant Baseline For Vision And Language Liunian Harold Li, Mark Yatskar, Da Yin, Cho-jui Hsieh, Kai-wei Chang
- Olmpics -- On What Language Model Pre-training Captures Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant
- Efficient Adaptation Of Pretrained Transformers For Abstractive Summarization Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi
- Multiqa: An Empirical Investigation Of Generalization And Transfer In Reading Comprehension Alon Talmor, Jonathan Berant
- Harnessing Evolution Of Multi-turn Conversations For Effective Answer Retrieval Mohammad Aliannejadi, Manajit Chakraborty, Esteban Andrés Ríssola, Fabio Crestani
- A Pre-training Based Personalized Dialogue Generation Model With Persona-sparse Data Yinhe Zheng, Rongsheng Zhang, Xiaoxi Mao, Minlie Huang
- MASS: Masked Sequence To Sequence Pre-training For Language Generation Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-yan Liu
- Enabling Robots To Understand Incomplete Natural Language Instructions Using Commonsense Reasoning Haonan Chen, Hao Tan, Alan Kuntz, Mohit Bansal, Ron Alterovitz
- ALBERT: A Lite BERT For Self-supervised Learning Of Language Representations Zhenzhong Lan et al.
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Ensemble-based Deep Reinforcement Learning For Chatbots Heriberto Cuayáhuitl et al.
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- The Curious Case Of Neural Text Degeneration Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi
- Structbert: Incorporating Language Structures Into Pre-training For Deep Language Understanding Wei Wang et al.
- Mixture Content Selection For Diverse Sequence Generation Jaemin Cho, Minjoon Seo, Hannaneh Hajishirzi
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Training Neural Response Selection For Task-oriented Dialogue Systems Matthew Henderson et al.
- Robust Navigation With Language Pretraining And Stochastic Sampling Xiujun Li et al.
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Multifit: Efficient Multi-lingual Language Model Fine-tuning Julian Martin Eisenschlos et al.
- Barack's Wife Hillary: Using Knowledge-graphs For Fact-aware Language Modeling Robert L. Logan IV, Nelson F. Liu, Matthew E. Peters, Matt Gardner, Sameer Singh
- Pretrained Language Models For Document-level Neural Machine Translation Liangyou Li, Xin Jiang, Qun Liu
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- GLTR: Statistical Detection And Visualization Of Generated Text Sebastian Gehrmann, Hendrik Strobelt, Alexander M. Rush
- Language Models As Knowledge Bases? Fabio Petroni et al.
- Camembert: A Tasty French Language Model Louis Martin et al.
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- DialoGPT: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Model Compression With Two-stage Multi-teacher Knowledge Distillation For Web Question Answering System Ze Yang, Linjun Shou, Ming Gong, Wutao Lin, Daxin Jiang
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- Hello, It's GPT-2 -- How Can I Help You? Towards The Use Of Pretrained Language Models For Task-oriented Dialogue Systems Paweł Budzianowski, Ivan Vulić
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Few-shot NLG With Pre-trained Language Model Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang
- Non-monotonic Sequential Text Generation Sean Welleck, Kianté Brantley, Hal Daumé III, Kyunghyun Cho
- Structured Pruning Of A Bert-based Question Answering Model J. S. Mccarley, Rishav Chakravarti, Avirup Sil
- Multimodal Transformer Networks For End-to-end Video-grounded Dialogue Systems Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- Pretrained Encyclopedia: Weakly Supervised Knowledge-pretrained Language Model Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
- UER: An Open-source Toolkit For Pre-training Models Zhe Zhao et al.
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- Reweighted Proximal Pruning For Large-scale Language Representation Fu-ming Guo, Sijia Liu, Finlay S. Mungall, Xue Lin, Yanzhi Wang
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- DistilBERT, A Distilled Version Of BERT: Smaller, Faster, Cheaper And Lighter Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
- Sticking To The Facts: Confident Decoding For Faithful Data-to-text Generation Ran Tian, Shashi Narayan, Thibault Sellam, Ankur P. Parikh
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- Attention-informed Mixed-language Training For Zero-shot Cross-lingual Task-oriented Dialogue Systems Zihan Liu, Genta Indra Winata, Zhaojiang Lin, Peng Xu, Pascale Fung
- Retrieve, Read, Rerank: Towards End-to-end Multi-document Reading Comprehension Minghao Hu, Yuxing Peng, Zhen Huang, Dongsheng Li
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Cross-lingual Natural Language Generation Via Pre-training Zewen Chi et al.
- Learning From Dialogue After Deployment: Feed Yourself, Chatbot! Braden Hancock, Antoine Bordes, Pierre-emmanuel Mazaré, Jason Weston
- Explain Yourself! Leveraging Language Models For Commonsense Reasoning Nazneen Fatema Rajani, Bryan Mccann, Caiming Xiong, Richard Socher
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Towards Transfer Learning For End-to-end Speech Synthesis From Deep Pre-trained Language Models Wei Fang, Yu-an Chung, James Glass
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- Unicoder: A Universal Language Encoder By Pre-training With Multiple Cross-lingual Tasks Haoyang Huang et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Build It Break It Fix It For Dialogue Safety: Robustness From Adversarial Human Attack Emily Dinan, Samuel Humeau, Bharath Chintagunta, Jason Weston
- Self-attentive Model For Headline Generation Daniil Gavrilov, Pavel Kalaidin, Valentin Malykh
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- Parallel Scheduled Sampling Daniel Duckworth, Arvind Neelakantan, Ben Goodrich, Lukasz Kaiser, Samy Bengio
- Fine-tuning Language Models From Human Preferences Daniel M. Ziegler et al.
- Fairseq: A Fast, Extensible Toolkit For Sequence Modeling Myle Ott et al.
- Automatic Spanish Translation Of The Squad Dataset For Multilingual Question Answering Casimiro Pio Carrino, Marta R. Costa-jussà, José A. R. Fonollosa
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan Mccann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Patent Claim Generation By Fine-tuning OpenAI GPT-2 Jieh-sheng Lee, Jieh Hsiang
- Multi-hop Question Answering Via Reasoning Chains Jifan Chen, Shih-ting Lin, Greg Durrett
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- Semantics-aware BERT For Language Understanding Zhuosheng Zhang et al.
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Learning And Evaluating General Linguistic Intelligence Dani Yogatama et al.
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-shian Wang, Hung-yi Lee, Yun-nung Chen
- A Simple But Effective Method To Incorporate Multi-turn Context With BERT For Conversational Machine Comprehension Yasuhito Ohsugi, Itsumi Saito, Kyosuke Nishida, Hisako Asano, Junji Tomita
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Inducing Brain-relevant Bias In Natural Language Processing Models Dan Schwartz, Mariya Toneva, Leila Wehbe
- Don't Say That! Making Inconsistent Dialogue Unlikely With Unlikelihood Training Margaret Li et al.
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Evaluating Commonsense In Pre-trained Language Models Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
- TinyBERT: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- End-to-end Bias Mitigation By Modelling Biases In Corpora Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- What Makes A Good Conversation? How Controllable Attributes Affect Human Judgments Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
- QASC: A Dataset For Question Answering Via Sentence Composition Tushar Khot, Peter Clark, Michal Guerquin, Peter Jansen, Ashish Sabharwal
- Nemo: A Toolkit For Building AI Applications Using Neural Modules Oleksii Kuchaiev et al.
- Towards Scalable Multi-domain Conversational Agents: The Schema-guided Dialogue Dataset Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- ConveRT: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Consistent Dialogue Generation With Self-supervised Feature Learning Yizhe Zhang et al.
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- LAMOL: Language Modeling For Lifelong Language Learning Fan-keng Sun, Cheng-hao Ho, Hung-yi Lee
- Winogrande: An Adversarial Winograd Schema Challenge At Scale Keisuke Sakaguchi, Ronan Le Bras, Chandra Bhagavatula, Yejin Choi
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian Mcauley
- Data Augmentation For BERT Fine-tuning In Open-domain Question Answering Wei Yang et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Visualizing And Understanding The Effectiveness Of BERT Yaru Hao, Li Dong, Furu Wei, Ke Xu
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew Mccallum
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-xiu Ye, Qian Chen, Wen Wang, Zhen-hua Ling
- Juice: A Large Scale Distantly Supervised Dataset For Open Domain Context-based Code Generation Rajas Agashe, Srinivasan Iyer, Luke Zettlemoyer
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- What Does BERT Learn From Multiple-choice Reading Comprehension Datasets? Chenglei Si, Shuohang Wang, Min-yen Kan, Jing Jiang
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Gmail Smart Compose: Real-time Assisted Writing Mia Xu Chen et al.
- 12-in-1: Multi-task Vision And Language Representation Learning Jiasen Lu, Vedanuj Goswami, Marcus Rohrbach, Devi Parikh, Stefan Lee
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- ZeRO: Memory Optimizations Toward Training Trillion Parameter Models Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He
- BART: Denoising Sequence-to-sequence Pre-training For Natural Language Generation, Translation, And Comprehension Mike Lewis et al.
- Paraphrasing With Large Language Models Sam Witteveen, Martin Andrews
- Learning To Select Knowledge For Response Generation In Dialog Systems Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu
- Codegru: Context-aware Deep Learning With Gated Recurrent Unit For Source Code Modeling Yasir Hussain, Zhiqiu Huang, Yu Zhou, Senzhang Wang
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Modifying Memories In Transformer Models Chen Zhu et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- BERT-of-Theseus: Compressing BERT By Progressive Module Replacing Canwen Xu, Wangchunshu Zhou, Tao Ge, Furu Wei, Ming Zhou
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse De Vries, Malvina Nissim
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- UNIMO: Towards Unified-modal Understanding And Generation Via Cross-modal Contrastive Learning Wei Li et al.
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-opazo, Stephen Gould
- Unifiedqa: Crossing Format Boundaries With A Single QA System Daniel Khashabi et al.
- KGPT: Knowledge-grounded Pre-training For Data-to-text Generation Wenhu Chen, Yu Su, Xifeng Yan, William Yang Wang
- Reducing Gender Bias In Neural Machine Translation As A Domain Adaptation Problem Danielle Saunders, Bill Byrne
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Inducing Language-agnostic Multilingual Representations Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- REALM: Retrieval-augmented Language Model Pre-training Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-wei Chang
- ELECTRA: Pre-training Text Encoders As Discriminators Rather Than Generators Kevin Clark, Minh-thang Luong, Quoc V. Le, Christopher D. Manning
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- M3P: Learning Universal Representations Via Multitask Multilingual Multimodal Pre-training Minheng Ni et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- Few-shot Generative Conversational Query Rewriting Shi Yu et al.
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Modelling Hierarchical Structure Between Dialogue Policy And Natural Language Generator With Option Framework For Task-oriented Dialogue System Jianhong Wang, Yuan Zhang, Tae-kyun Kim, Yunjie Gu
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- Controlled Hallucinations: Learning To Generate Faithfully From Noisy Data Katja Filippova
- XGLUE: A New Benchmark Dataset For Cross-lingual Pre-training, Understanding And Generation Yaobo Liang et al.
- Realtoxicityprompts: Evaluating Neural Toxic Degeneration In Language Models Samuel Gehman, Suchin Gururangan, Maarten Sap, Yejin Choi, Noah A. Smith
- Measuring And Reducing Gendered Correlations In Pre-trained Models Kellie Webster et al.
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- EarlyBERT: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Multilingual Translation With Extensible Multilingual Pretraining And Finetuning Yuqing Tang et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- A Knowledge-enhanced Pretraining Model For Commonsense Story Generation Jian Guan, Fei Huang, Zhihao Zhao, Xiaoyan Zhu, Minlie Huang
- WT5?! Training Text-to-text Models To Explain Their Predictions Sharan Narang et al.
- TOD-BERT: Pre-trained Natural Language Understanding For Task-oriented Dialogue Chien-sheng Wu, Steven Hoi, Richard Socher, Caiming Xiong
- Optimus: Organizing Sentences Via Pre-trained Modeling Of A Latent Space Chunyuan Li et al.
- Knowledge Distillation For Improved Accuracy In Spoken Question Answering Chenyu You, Nuo Chen, Yuexian Zou
- Mathematical Reasoning Via Self-supervised Skip-tree Training Markus N. Rabe, Dennis Lee, Kshitij Bansal, Christian Szegedy
- GShard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- UBAR: Towards Fully End-to-end Task-oriented Dialog Systems With GPT-2 Yunyi Yang, Yunhao Li, Xiaojun Quan
- ERNIE-GEN: An Enhanced Multi-flow Pre-training And Fine-tuning Framework For Natural Language Generation Dongling Xiao et al.
- CLEAR: Contrastive Learning For Sentence Representation Zhuofeng Wu et al.
- Dialoglue: A Natural Language Understanding Benchmark For Task-oriented Dialogue Shikib Mehri, Mihail Eric, Dilek Hakkani-tur
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- AutoPrompt: Eliciting Knowledge From Language Models With Automatically Generated Prompts Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh
- Mapping Natural Language Instructions To Mobile UI Action Sequences Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
- TernaryBERT: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Language Models Are Few-shot Learners Tom B. Brown et al.
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- Coreferential Reasoning Learning For Language Representation Deming Ye et al.
- LayoutLMv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- Learning To Recombine And Resample Data For Compositional Generalization Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
- Adversarial Training For Large Neural Language Models Xiaodong Liu et al.
- Injecting Numerical Reasoning Skills Into Language Models Mor Geva, Ankit Gupta, Jonathan Berant
- Alfworld: Aligning Text And Embodied Environments For Interactive Learning Mohit Shridhar et al.
- Intermediate-task Transfer Learning With Pretrained Models For Natural Language Understanding: When And Why Does It Work? Yada Pruksachatkun et al.
- Residual Energy-based Models For Text Generation Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'aurelio Ranzato
- Mintl: Minimalist Transfer Learning For Task-oriented Dialogue Systems Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Pascale Fung
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- Fine-tuning Pretrained Language Models: Weight Initializations, Data Orders, And Early Stopping Jesse Dodge et al.
- How Can We Know When Language Models Know? On The Calibration Of Language Models For Question Answering Zhengbao Jiang, Jun Araki, Haibo Ding, Graham Neubig
- Unsupervised Evaluation Of Interactive Dialog With Dialogpt Shikib Mehri, Maxine Eskenazi
- Fine-tuning Pre-trained Language Model With Weak Supervision: A Contrastive-regularized Self-training Approach Yue Yu et al.
- Generative Data Augmentation For Commonsense Reasoning Yiben Yang et al.
- Recipes For Safety In Open-domain Chatbots Jing Xu et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- PALM: Pre-training An Autoencoding & Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- A Large-scale Chinese Short-text Conversation Dataset Yida Wang et al.
- XGPT: Cross-modal Generative Pre-training For Image Captioning Qiaolin Xia et al.
- Better Robustness By More Coverage: Adversarial Training With Mixup Augmentation For Robust Fine-tuning Chenglei Si et al.
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- When Being Unseen From Mbert Is Just The Beginning: Handling New Languages With Multilingual Language Models Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah
- Recall And Learn: Fine-tuning Deep Pretrained Language Models With Less Forgetting Sanyuan Chen et al.
- Gedi: Generative Discriminator Guided Sequence Generation Ben Krause et al.
- Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies For Multi-turn Response Selection Taesun Whang et al.
- Rethinking The Value Of Transformer Components Wenxuan Wang, Zhaopeng Tu
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Funnel-transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- CPM: A Large-scale Generative Chinese Pre-trained Language Model Zhengyan Zhang et al.
- Query Resolution For Conversational Search With Limited Supervision Nikos Voskarides, Dan Li, Pengjie Ren, Evangelos Kanoulas, Maarten De Rijke
- Mixkd: Towards Efficient Distillation Of Large-scale Language Models Kevin J Liang et al.
- Exploring And Predicting Transferability Across NLP Tasks Tu Vu et al.
- Countering Language Drift With Seeded Iterated Learning Yuchen Lu, Soumye Singhal, Florian Strub, Olivier Pietquin, Aaron Courville
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- Better Fine-tuning By Reducing Representational Collapse Armen Aghajanyan et al.
- Prophetnet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Text Generation By Learning From Demonstrations Richard Yuanzhe Pang, He He
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- Knowledge-driven Data Construction For Zero-shot Evaluation In Commonsense Question Answering Kaixin Ma et al.
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- Adapterhub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Beyond English-centric Multilingual Machine Translation Angela Fan et al.
- Adapterdrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Multilingual Denoising Pre-training For Neural Machine Translation Yinhan Liu et al.
- Language Models As Few-shot Learner For Task-oriented Dialogue Systems Andrea Madotto, Zihan Liu, Zhaojiang Lin, Pascale Fung
- Explaining Question Answering Models Through Text Generation Veronica Latcinnik, Jonathan Berant
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- Question And Answer Test-train Overlap In Open-domain Question Answering Datasets Patrick Lewis, Pontus Stenetorp, Sebastian Riedel
- Contrastive Code Representation Learning Paras Jain et al.
- What Happens To BERT Embeddings During Fine-tuning? Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- How Good Is Your Tokenizer? On The Monolingual Performance Of Multilingual Language Models Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
- Facts As Experts: Adaptable And Interpretable Neural Memory Over Symbolic Knowledge Pat Verga, Haitian Sun, Livio Baldini Soares, William W. Cohen
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- Cocon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Are We Pretraining It Right? Digging Deeper Into Visio-linguistic Pretraining Amanpreet Singh, Vedanuj Goswami, Devi Parikh
- Retrieval-augmented Generation For Knowledge-intensive NLP Tasks Patrick Lewis et al.
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- GOBO: Quantizing Attention-based NLP Models For Low Latency And Energy Efficient Inference Ali Hadi Zadeh, Isak Edo, Omar Mohamed Awad, Andreas Moshovos
- Template-based Question Generation From Retrieved Sentences For Improved Unsupervised Question Answering Alexander R. Fabbri, Patrick Ng, Zhiguo Wang, Ramesh Nallapati, Bing Xiang
- Nearest Neighbor Machine Translation Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- ABNIRML: Analyzing The Behavior Of Neural IR Models Sean Macavaney, Sergey Feldman, Nazli Goharian, Doug Downey, Arman Cohan
- Exploring Fine-tuning Techniques For Pre-trained Cross-lingual Models Via Continual Learning Zihan Liu, Genta Indra Winata, Andrea Madotto, Pascale Fung
- Dialogbert: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- Logic-guided Data Augmentation And Regularization For Consistent Question Answering Akari Asai, Hannaneh Hajishirzi
- Exploring Versatile Generative Language Model Via Parameter-efficient Transfer Learning Zhaojiang Lin, Andrea Madotto, Pascale Fung
- Beyond I.I.D.: Three Levels Of Generalization For Question Answering On Knowledge Bases Yu Gu et al.
- Tabert: Pretraining For Joint Understanding Of Textual And Tabular Data Pengcheng Yin, Graham Neubig, Wen-tau Yih, Sebastian Riedel
- TAP: Text-aware Pre-training For Text-vqa And Text-caption Zhengyuan Yang et al.
- Schema-guided Dialogue State Tracking Task At DSTC8 Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
- BANG: Bridging Autoregressive And Non-autoregressive Generation With Large Scale Pretraining Weizhen Qi et al.
- How Much Knowledge Can You Pack Into The Parameters Of A Language Model? Adam Roberts, Colin Raffel, Noam Shazeer
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- It's Not Just Size That Matters: Small Language Models Are Also Few-shot Learners Timo Schick, Hinrich Schütze
- Minilm: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- Mt5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- Zero-resource Knowledge-grounded Dialogue Generation Linxiao Li et al.
- ECONET: Effective Continual Pretraining Of Language Models For Event Temporal Reasoning Rujun Han, Xiang Ren, Nanyun Peng
- Codebert: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- Logic2text: High-fidelity Natural Language Generation From Logical Forms Zhiyu Chen et al.
- Incorporating External Knowledge Through Pre-training For Natural Language To Code Generation Frank F. Xu, Zhengbao Jiang, Pengcheng Yin, Bogdan Vasilescu, Graham Neubig
- Human Instruction-following With Deep Reinforcement Learning Via Transfer-learning From Text Felix Hill, Sona Mokra, Nathaniel Wong, Tim Harley
- Grounded Language Learning Fast And Slow Felix Hill et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Multi-modal Open-domain Dialogue Kurt Shuster, Eric Michael Smith, Da Ju, Jason Weston
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Will I Sound Like Me? Improving Persona Consistency In Dialogues Through Pragmatic Self-consciousness Hyunwoo Kim, Byeongchang Kim, Gunhee Kim
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Low-resource Knowledge-grounded Dialogue Generation Xueliang Zhao et al.
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Pre-training Via Paraphrasing Mike Lewis et al.
- Training Question Answering Models From Synthetic Data Raul Puri, Ryan Spring, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
- Investigating Pretrained Language Models For Graph-to-text Generation Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, Iryna Gurevych
- Unilmv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation Ruibo Liu et al.
- Making Pre-trained Language Models Better Few-shot Learners Tianyu Gao, Adam Fisch, Danqi Chen
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- Scaling Laws For Neural Language Models Jared Kaplan et al.
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- Charbert: Character-aware Pre-trained Language Model Wentao Ma et al.
- DAVE: Deriving Automatically Verilog From English Hammond Pearce, Benjamin Tan, Ramesh Karri
- Cosda-ml: Multi-lingual Code-switching Data Augmentation For Zero-shot Cross-lingual NLP Libo Qin, Minheng Ni, Yue Zhang, Wanxiang Che
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- The Pile: An 800GB Dataset Of Diverse Text For Language Modeling Leo Gao et al.
- Ernie-doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- Vokenization: Improving Language Understanding With Contextualized, Visual-grounded Supervision Hao Tan, Mohit Bansal
- Recipes For Building An Open-domain Chatbot Stephen Roller et al.
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- LRC-BERT: Latent-representation Contrastive Knowledge Distillation For Natural Language Understanding Hao Fu et al.
- Gpt-too: A Language-model-first Approach For Amr-to-text Generation Manuel Mager et al.
- Contrastive Distillation On Intermediate Representations For Language Model Compression Siqi Sun et al.
- PLATO-2: Towards Building An Open-domain Chatbot Via Curriculum Learning Siqi Bao et al.
- Few-shot Text Generation With Pattern-exploiting Training Timo Schick, Hinrich Schütze
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- Imitation Attacks And Defenses For Black-box Machine Translation Systems Eric Wallace, Mitchell Stern, Dawn Song
- Can You Put It All Together: Evaluating Conversational Agents' Ability To Blend Skills Eric Michael Smith, Mary Williamson, Kurt Shuster, Jason Weston, Y-lan Boureau
- Robust Encodings: A Framework For Combating Adversarial Typos Erik Jones, Robin Jia, Aditi Raghunathan, Percy Liang
- To Pretrain Or Not To Pretrain: Examining The Benefits Of Pretraining On Resource Rich Tasks Sinong Wang, Madian Khabsa, Hao Ma
- Controlling Style In Generated Dialogue Eric Michael Smith, Diana Gonzalez-rico, Emily Dinan, Y-lan Boureau
- Mobilebert: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Data Manipulation: Towards Effective Instance Learning For Neural Dialogue Generation Via Learning To Augment And Reweight Hengyi Cai et al.
- Language Generation With Multi-hop Reasoning On Commonsense Knowledge Graph Haozhe Ji et al.
- Multilingual Speech Translation With Efficient Finetuning Of Pretrained Models Xian Li et al.
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- EVA: An Open-domain Chinese Dialogue System With Large-scale Generative Pre-training Hao Zhou et al.
- Lightningdot: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- Indonlg: Benchmark And Resources For Evaluating Indonesian Natural Language Generation Samuel Cahyawijaya et al.
- Lightner: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- Fewshotqa: A Simple Framework For Few-shot Learning Of Question Answering Tasks Using Pre-trained Text-to-text Models Rakesh Chada, Pradeep Natarajan
- Bias Out-of-the-box: An Empirical Analysis Of Intersectional Occupational Biases In Popular Generative Language Models Hannah Kirk et al.
- Increasing Faithfulness In Knowledge-grounded Dialogue With Controllable Features Hannah Rashkin, David Reitter, Gaurav Singh Tomar, Dipanjan Das
- Ernie-vilg: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- Vlmo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- Redditbias: A Real-world Resource For Bias Evaluation And Debiasing Of Conversational Language Models Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Truthfulqa: Measuring How Models Mimic Human Falsehoods Stephanie Lin, Jacob Hilton, Owain Evans
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- Longt5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- How Should Pre-trained Language Models Be Fine-tuned Towards Adversarial Robustness? Xinhsuai Dong, Luu Anh Tuan, Min Lin, Shuicheng Yan, Hanwang Zhang
- Learning How To Ask: Querying Lms With Mixtures Of Soft Prompts Guanghui Qin, Jason Eisner
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- Wenlan: Bridging Vision And Language By Large-scale Multi-modal Pre-training Yuqi Huo et al.
- Is GPT-3 Text Indistinguishable From Human Text? Scarecrow: A Framework For Scrutinizing Machine Text Yao Dou, Maxwell Forbes, Rik Koncel-kedziorski, Noah A. Smith, Yejin Choi
- Mitigating Political Bias In Language Models Through Reinforced Calibration Ruibo Liu et al.
- Grounded Language-image Pre-training Liunian Harold Li et al.
- Byt5: Towards A Token-free Future With Pre-trained Byte-to-byte Models Linting Xue et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- Robeczech: Czech Roberta, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Efficient Large Scale Language Modeling With Mixtures Of Experts Mikel Artetxe et al.
- XLM-E: Cross-lingual Language Model Pre-training Via ELECTRA Zewen Chi et al.
- Codified Audio Language Modeling Learns Useful Representations For Music Information Retrieval Rodrigo Castellon, Chris Donahue, Percy Liang
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- True Few-shot Learning With Language Models Ethan Perez, Douwe Kiela, Kyunghyun Cho
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Fast Model Editing At Scale Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
- Process For Adapting Language Models To Society (PALMS) With Values-targeted Datasets Irene Solaiman, Christy Dennison
- Deltalm: Encoder-decoder Pre-training For Language Generation And Translation By Augmenting Pretrained Multilingual Encoders Shuming Ma et al.
- Thank You BART! Rewarding Pre-trained Models Improves Formality Style Transfer Huiyuan Lai, Antonio Toral, Malvina Nissim
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- Raise A Child In Large Language Model: Towards Effective And Generalizable Fine-tuning Runxin Xu et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- Rethink Training Of BERT Rerankers In Multi-stage Retrieval Pipeline Luyu Gao, Zhuyun Dai, Jamie Callan
- Structural Adapters In Pretrained Language Models For Amr-to-text Generation Leonardo F. R. Ribeiro, Yue Zhang, Iryna Gurevych
- Bob: BERT Over BERT For Training Persona-based Dialogue Models From Limited Personalized Data Haoyu Song, Yan Wang, Kaiyan Zhang, Wei-nan Zhang, Ting Liu
- On The Effectiveness Of Adapter-based Tuning For Pretrained Language Model Adaptation Ruidan He et al.
- Wangchanberta: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- Program Synthesis With Large Language Models Jacob Austin et al.
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- On Explaining Your Explanations Of BERT: An Empirical Study With Sequence Classification Zhengxuan Wu, Desmond C. Ong
- Retrieval Augmentation Reduces Hallucination In Conversation Kurt Shuster, Spencer Poff, Moya Chen, Douwe Kiela, Jason Weston
- Revisiting The Primacy Of English In Zero-shot Cross-lingual Transfer Iulia Turc, Kenton Lee, Jacob Eisenstein, Ming-wei Chang, Kristina Toutanova
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Using Prior Knowledge To Guide Bert's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- Cross-task Generalization Via Natural Language Crowdsourcing Instructions Swaroop Mishra, Daniel Khashabi, Chitta Baral, Hannaneh Hajishirzi
- ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Shuohuan Wang et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Metaicl: Learning To Learn In Context Sewon Min, Mike Lewis, Luke Zettlemoyer, Hannaneh Hajishirzi
- Societal Biases In Language Generation: Progress And Challenges Emily Sheng, Kai-wei Chang, Premkumar Natarajan, Nanyun Peng
- A Recipe For Arbitrary Text Style Transfer With Large Language Models Emily Reif et al.
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Unsupervised Corpus Aware Language Model Pre-training For Dense Passage Retrieval Luyu Gao, Jamie Callan
- All That's 'human' Is Not Gold: Evaluating Human Evaluation Of Generated Text Elizabeth Clark et al.
- Internet-augmented Dialogue Generation Mojtaba Komeili, Kurt Shuster, Jason Weston
- Bitfit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- Lora: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Sequence Length Is A Domain: Length-based Overfitting In Transformer Models Dušan Variš, Ondřej Bojar
- Few-shot Learning With Multilingual Language Models Xi Victoria Lin et al.
- Unipelt: A Unified Framework For Parameter-efficient Language Model Tuning Yuning Mao et al.
- Meta-learning Via Language Model In-context Tuning Yanda Chen, Ruiqi Zhong, Sheng Zha, George Karypis, He He
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Compacter: Efficient Low-rank Hypercomplex Adapter Layers Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder
- End-to-end Training Of Multi-document Reader And Retriever For Open-domain Question Answering Devendra Singh Sachan, Siva Reddy, William Hamilton, Chris Dyer, Dani Yogatama
- Improving And Simplifying Pattern Exploiting Training Derek Tam, Rakesh R Menon, Mohit Bansal, Shashank Srivastava, Colin Raffel
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- Pangu-α: Large-scale Autoregressive Pretrained Chinese Language Models With Auto-parallel Computation Wei Zeng et al.
- Variational Information Bottleneck For Effective Low-resource Fine-tuning Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- CPM-2: Large-scale Cost-effective Pre-trained Language Models Zhengyan Zhang et al.
- Efficient Large-scale Language Model Training On GPU Clusters Using Megatron-lm Deepak Narayanan et al.
- Glam: Efficient Scaling Of Language Models With Mixture-of-experts Nan Du et al.
- On Transferability Of Prompt Tuning For Natural Language Processing Yusheng Su et al.
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Towards Few-shot Fact-checking Via Perplexity Nayeon Lee, Yejin Bang, Andrea Madotto, Madian Khabsa, Pascale Fung
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Zero-shot Recommendation As Language Modeling Damien Sileo, Wout Vossen, Robbe Raymaekers
- A Plug-and-play Method For Controlled Text Generation Damian Pascual, Beni Egressy, Clara Meister, Ryan Cotterell, Roger Wattenhofer
- Unified Pre-training For Program Understanding And Generation Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-wei Chang
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- Can Generative Pre-trained Language Models Serve As Knowledge Bases For Closed-book QA? Cunxiang Wang, Pai Liu, Yue Zhang
- The Stability-efficiency Dilemma: Investigating Sequence Length Warmup For Training GPT Models Conglong Li, Minjia Zhang, Yuxiong He
- Why Do Pretrained Language Models Help In Downstream Tasks? An Analysis Of Head And Prompt Tuning Colin Wei, Sang Michael Xie, Tengyu Ma
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- What To Pre-train On? Efficient Intermediate Task Selection Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- MAGMA -- Multimodal Augmentation Of Generative Models Through Adapter-based Finetuning Constantin Eichenberg, Sidney Black, Samuel Weinbach, Letitia Parcalabescu, Anette Frank
- Fantastically Ordered Prompts And Where To Find Them: Overcoming Few-shot Prompt Order Sensitivity Yao Lu, Max Bartolo, Alastair Moore, Sebastian Riedel, Pontus Stenetorp
- Counterfactual Memorization In Neural Language Models Chiyuan Zhang et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- Structurallm: Structural Pre-training For Form Understanding Chenliang Li et al.
- Climatebert: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- NSP-BERT: A Prompt-based Few-shot Learner Through An Original Pre-training Task -- Next Sentence Prediction Yi Sun, Yu Zheng, Chao Hao, Hangping Qiu
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Prefix-tuning: Optimizing Continuous Prompts For Generation Xiang Lisa Li, Percy Liang
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- Codet5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- Scheduled Sampling In Vision-language Pretraining With Decoupled Encoder-decoder Network Yehao Li, Yingwei Pan, Ting Yao, Jingwen Chen, Tao Mei
- Calibrate Before Use: Improving Few-shot Performance Of Language Models Tony Z. Zhao, Eric Wallace, Shi Feng, Dan Klein, Sameer Singh
- Multitask Prompted Training Enables Zero-shot Task Generalization Victor Sanh et al.
- Terapipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Generating Datasets With Pretrained Language Models Timo Schick, Hinrich Schütze
- Multimodal Dialogue Response Generation Qingfeng Sun et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- Adversarial GLUE: A Multi-task Benchmark For Robustness Evaluation Of Language Models Boxin Wang et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On Hyperclova: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Vl-adapter: Parameter-efficient Transfer Learning For Vision-and-language Tasks Yi-lin Sung, Jaemin Cho, Mohit Bansal
- COCO-LM: Correcting And Contrasting Text Sequences For Language Model Pretraining Yu Meng et al.
- Medically Aware GPT-3 As A Data Generator For Medical Dialogue Summarization Bharath Chintagunta, Namit Katariya, Xavier Amatriain, Anitha Kannan
- Indicbart: A Pre-trained Model For Indic Natural Language Generation Raj Dabre et al.
- Neural Path Hunter: Reducing Hallucination In Dialogue Systems Via Path Grounding Nouha Dziri, Andrea Madotto, Osmar Zaiane, Avishek Joey Bose
- Ext5: Towards Extreme Multi-task Scaling For Transfer Learning Vamsi Aribandi et al.
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- Urltran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Maria: Spanish Language Models Asier Gutiérrez-fandiño et al.
- Learning To Retrieve Prompts For In-context Learning Ohad Rubin, Jonathan Herzig, Jonathan Berant
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- Revisiting Self-training For Few-shot Learning Of Language Model Yiming Chen et al.
- Baleen: Robust Multi-hop Reasoning At Scale Via Condensed Retrieval Omar Khattab, Christopher Potts, Matei Zaharia
- Hindsight: Posterior-guided Training Of Retrievers For Improved Open-ended Generation Ashwin Paranjape, Omar Khattab, Christopher Potts, Matei Zaharia, Christopher D. Manning
- Muppet: Massive Multi-task Representations With Pre-finetuning Armen Aghajanyan et al.
- HTLM: Hyper-text Pre-training And Prompting Of Language Models Armen Aghajanyan et al.
- PPT: Pre-trained Prompt Tuning For Few-shot Learning Yuxian Gu, Xu Han, Zhiyuan Liu, Minlie Huang
- ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training For Language Understanding And Generation Yu Sun et al.
- GALAXY: A Generative Pre-trained Model For Task-oriented Dialog With Semi-supervised Learning And Explicit Policy Injection Wanwei He et al.
- Prompt-learning For Fine-grained Entity Typing Ning Ding et al.
- GLM: General Language Model Pretraining With Autoregressive Blank Infilling Zhengxiao Du et al.
- Sustainable Modular Debiasing Of Language Models Anne Lauscher, Tobias Lüken, Goran Glavaš
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- Simvlm: Simple Visual Language Model Pretraining With Weak Supervision Zirui Wang et al.
- General-purpose Question-answering With Macaw Oyvind Tafjord, Peter Clark
- Denseclip: Language-guided Dense Prediction With Context-aware Prompting Yongming Rao et al.
- Scalable And Efficient Moe Training For Multitask Multilingual Models Young Jin Kim et al.
- Few-shot Bot: Prompt-based Learning For Dialogue Systems Andrea Madotto, Zhaojiang Lin, Genta Indra Winata, Pascale Fung
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- Multi-task Pre-training For Plug-and-play Task-oriented Dialogue System Yixuan Su et al.
- MWP-BERT: Numeracy-augmented Pre-training For Math Word Problem Solving Zhenwen Liang et al.
- A General Language Assistant As A Laboratory For Alignment Amanda Askell et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- FLAVA: A Foundational Language And Vision Alignment Model Amanpreet Singh et al.
- Few-shot Question Answering By Pretraining Span Selection Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- Clip-adapter: Better Vision-language Models With Feature Adapters Peng Gao et al.
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- One Question Answering Model For Many Languages With Cross-lingual Dense Passage Retrieval Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
- Debertav3: Improving Deberta Using Electra-style Pre-training With Gradient-disentangled Embedding Sharing Pengcheng He, Jianfeng Gao, Weizhu Chen
- Symbolic Knowledge Distillation: From General Language Models To Commonsense Models Peter West et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Distilling Large Language Models Into Tiny And Effective Students Using Pqrnn Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-Yi Dou et al.
- Robertuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Learned Token Pruning For Transformers Sehoon Kim et al.
- Training Large-scale News Recommenders With Pretrained Language Models In The Loop Shitao Xiao, Zheng Liu, Yingxia Shao, Tao Di, Xing Xie
- Visualgpt: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- GPT Understands, Too Xiao Liu et al.
- Pretrained Language Models For Text Generation: A Survey Junyi Li, Tianyi Tang, Wayne Xin Zhao, Ji-Rong Wen
- Multi-modal Understanding And Generation For Medical Images And Text Via Vision-language Pre-training Jong Hak Moon, Hyungyung Lee, Woncheol Shin, Young-Hak Kim, Edward Choi
- Unlocking Compositional Generalization In Pre-trained Models Using Intermediate Representations Jonathan Herzig et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Open Domain Question Answering Over Tables Via Dense Retrieval Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- Cutting Down On Prompts And Parameters: Simple Few-shot Learning With Language Models Robert L. Logan IV et al.
- FLEX: Unifying Evaluation For Few-shot NLP Jonathan Bragg, Arman Cohan, Kyle Lo, Iz Beltagy
- Fine-tuning Large Neural Language Models For Biomedical Natural Language Processing Robert Tinn et al.
- How Many Data Points Is A Prompt Worth? Teven Le Scao, Alexander M. Rush
- OPT: Omni-perception Pre-trainer For Cross-modal Understanding And Generation Jing Liu et al.
- Improving Coherence And Consistency In Neural Sequence Models With Dual-system, Neuro-symbolic Reasoning Maxwell Nye, Michael Henry Tessler, Joshua B. Tenenbaum, Brenden M. Lake
- Using Adversarial Attacks To Reveal The Statistical Bias In Machine Reading Comprehension Models Jieyu Lin, Jiajie Zou, Nai Ding
- Dialoglm: Pre-trained Model For Long Dialogue Understanding And Summarization Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- Tip-adapter: Training-free Clip-adapter For Better Vision-language Modeling Renrui Zhang et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- Long Text Generation By Modeling Sentence-level And Discourse-level Coherence Jian Guan et al.
- P-tuning V2: Prompt Tuning Can Be Comparable To Fine-tuning Universally Across Scales And Tasks Xiao Liu et al.
- UC2: Universal Cross-lingual Cross-modal Vision-and-language Pre-training Mingyang Zhou et al.
- Fastmoe: A Fast Mixture-of-expert Training System Jiaao He et al.
- Hiddencut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- Webgpt: Browser-assisted Question-answering With Human Feedback Reiichiro Nakano et al.
- Recursively Summarizing Books With Human Feedback Jeff Wu et al.
- Adapting Language Models For Zero-shot Learning By Meta-tuning On Dataset And Prompt Collections Ruiqi Zhong, Kristy Lee, Zheng Zhang, Dan Klein
- AMMUS : A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- WARP: Word-level Adversarial Reprogramming Karen Hambardzumyan, Hrant Khachatrian, Jonathan May
- Gpt3mix: Leveraging Large-scale Language Models For Text Augmentation Kang Min Yoo, Dongju Park, Jaewook Kang, Sang-Woo Lee, Woomyeong Park
- Hurdles To Progress In Long-form Question Answering Kalpesh Krishna, Aurko Roy, Mohit Iyyer
- Learning To Prompt For Vision-language Models Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- A Simple Recipe For Multilingual Grammatical Error Correction Sascha Rothe, Jonathan Mallinson, Eric Malmi, Sebastian Krause, Aliaksei Severyn
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- Normformer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- Reacc: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Decoupling Knowledge From Memorization: Retrieval-augmented Prompt Learning Xiang Chen et al.
- Language Models As Agent Models Jacob Andreas
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- Linearly Mapping From Image To Text Space Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
- Diverse Demonstrations Improve In-context Compositional Generalization Itay Levy, Ben Bogin, Jonathan Berant
- Explanations From Large Language Models Make Small Reasoners Better Shiyang Li et al.
- Coderl: Mastering Code Generation Through Pretrained Models And Deep Reinforcement Learning Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C. H. Hoi
- Retromae: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- Biobart: Pretraining And Evaluation Of A Biomedical Generative Language Model Hongyi Yuan et al.
- One Embedder, Any Task: Instruction-finetuned Text Embeddings Hongjin Su et al.
- Revisiting The "video" In Video-language Understanding Shyamal Buch et al.
- Repair Is Nearly Generation: Multilingual Program Repair With Llms Harshit Joshi et al.
- Interactive And Visual Prompt Engineering For Ad-hoc Task Adaptation With Large Language Models Hendrik Strobelt et al.
- Gpt-neox-20b: An Open-source Autoregressive Language Model Sid Black et al.
- Black-box Prompt Learning For Pre-trained Language Models Shizhe Diao et al.
- Interleaving Retrieval With Chain-of-thought Reasoning For Knowledge-intensive Multi-step Questions Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Few-shot Parameter-efficient Fine-tuning Is Better And Cheaper Than In-context Learning Haokun Liu et al.
- Prompt Tuning For Generative Multimodal Pretrained Models Hao Yang et al.
- Uni-perceiver V2: A Generalist Model For Large-scale Vision And Vision-language Tasks Hao Li et al.
- Vl-beit: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- Rethinking With Retrieval: Faithful Large Language Model Inference Hangfeng He, Hongming Zhang, Dan Roth
- OPT-IML: Scaling Language Model Instruction Meta Learning Through The Lens Of Generalization Srinivasan Iyer et al.
- Ask Me Anything: A Simple Strategy For Prompting Language Models Simran Arora et al.
- Large Language Models And The Reverse Turing Test Terrence Sejnowski
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- Dylora: Parameter Efficient Tuning Of Pre-trained Models Using Dynamic Search-free Low-rank Adaptation Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi
- Healthprompt: A Zero-shot Learning Paradigm For Clinical Natural Language Processing Sonish Sivarajkumar, Yanshan Wang
- Smoothquant: Accurate And Efficient Post-training Quantization For Large Language Models Guangxuan Xiao et al.
- Revisiting Parameter-efficient Tuning: Are We Really There Yet? Guanzheng Chen, Fangyu Liu, Zaiqiao Meng, Shangsong Liang
- Altclip: Altering The Language Encoder In CLIP For Extended Language Capabilities Zhongzhi Chen et al.
- Atlas: Few-shot Learning With Retrieval Augmented Language Models Gautier Izacard et al.
- Data Augmentation For Intent Classification With Off-the-shelf Large Language Models Gaurav Sahu et al.
- Prototypical Verbalizer For Prompt-based Few-shot Tuning Ganqu Cui, Shengding Hu, Ning Ding, Longtao Huang, Zhiyuan Liu
- Synchromesh: Reliable Code Generation From Pre-trained Language Models Gabriel Poesia et al.
- On The Transferability Of Pre-trained Language Models For Low-resource Programming Languages Fuxiang Chen, Fatemeh Fard, David Lo, Timofey Bryksin
- CREPE: Can Vision-language Foundation Models Reason Compositionally? Zixian Ma et al.
- Pangu-coder: Program Synthesis With Function-level Language Modeling Fenia Christopoulou et al.
- Visual-language Navigation Pretraining Via Prompt-based Environmental Self-exploration Xiwen Liang, Fengda Zhu, Lingling Li, Hang Xu, Xiaodan Liang
- Vision-language Intelligence: Tasks, Representation Learning, And Large Models Feng Li et al.
- SKILL: Structured Knowledge Infusion For Large Language Models Fedor Moiseev, Zhe Dong, Enrique Alfonseca, Martin Jaggi
- NLX-GPT: A Model For Natural Language Explanations In Vision And Vision-language Tasks Fawaz Sammani, Tanmoy Mukherjee, Nikos Deligiannis
- Deplot: One-shot Visual Language Reasoning By Plot-to-table Translation Fangyu Liu et al.
- Legal Prompting: Teaching A Language Model To Think Like A Lawyer Fangyi Yu, Lee Quartey, Frank Schilder
- Red Teaming Language Models With Language Models Ethan Perez et al.
- Vl-interpret: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- Codegen: An Open Large Language Model For Code With Multi-turn Program Synthesis Erik Nijkamp et al.
- Sgva-clip: Semantic-guided Visual Adapting Of Vision-language Models For Few-shot Image Classification Fang Peng, Xiaoshan Yang, Linhui Xiao, Yaowei Wang, Changsheng Xu
- Star: Bootstrapping Reasoning With Reasoning Eric Zelikman, Yuhuai Wu, Jesse Mu, Noah D. Goodman
- Capturing Failures Of Large Language Models Via Human Cognitive Biases Erik Jones, Jacob Steinhardt
- M6-rec: Generative Pretrained Language Models Are Open-ended Recommender Systems Zeyu Cui, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang
- Vl-checklist: Evaluating Pre-trained Vision-language Models With Objects, Attributes And Relations Tiancheng Zhao et al.
- Structured Like A Language Model: Analysing AI As An Automated Subject Liam Magee, Vanicka Arora, Luke Munn
- Real Or Fake Text?: Investigating Human Ability To Detect Boundaries Between Human-written And Machine-generated Text Liam Dugan, Daphne Ippolito, Arun Kirubarajan, Sherry Shi, Chris Callison-Burch
- Inner Monologue: Embodied Reasoning Through Planning With Language Models Wenlong Huang et al.
- Flashattention: Fast And Memory-efficient Exact Attention With Io-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- The Goldilocks Of Pragmatic Understanding: Fine-tuning Strategy Matters For Implicature Resolution By Llms Laura Ruis et al.
- Exploring The Universal Vulnerability Of Prompt-based Learning Paradigm Lei Xu, Yangyi Chen, Ganqu Cui, Hongcheng Gao, Zhiyuan Liu
- Efficient Training Of Language Models To Fill In The Middle Mohammad Bavarian et al.
- Memorization Without Overfitting: Analyzing The Training Dynamics Of Large Language Models Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
- Blenderbot 3: A Deployed Conversational Agent That Continually Learns To Responsibly Engage Kurt Shuster et al.
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Alexatm 20B: Few-shot Learning Using A Large-scale Multilingual Seq2seq Model Saleh Soltan et al.
- Speechprompt: An Exploration Of Prompt Tuning On Generative Spoken Language Model For Speech Processing Tasks Kai-Wei Chang, Wei-Cheng Tseng, Shang-Wen Li, Hung-Yi Lee
- REVEAL: Retrieval-augmented Visual-language Pre-training With Multi-source Multimodal Knowledge Memory Ziniu Hu et al.
- BLIP: Bootstrapping Language-image Pre-training For Unified Vision-language Understanding And Generation Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
- Deepspeed-moe: Advancing Mixture-of-experts Inference And Training To Power Next-generation AI Scale Samyam Rajbhandari et al.
- Language Models (mostly) Know What They Know Saurav Kadavath et al.
- Visual Programming: Compositional Visual Reasoning Without Training Tanmay Gupta, Aniruddha Kembhavi
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Do Language Models Plagiarize? Jooyoung Lee, Thai Le, Jinghui Chen, Dongwon Lee
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- On The Effect Of Pretraining Corpora On In-context Learning By A Large-scale Language Model Seongjin Shin et al.
- Coditt5: Pretraining For Source Code And Natural Language Editing Jiyang Zhang, Sheena Panthaplackel, Pengyu Nie, Junyi Jessy Li, Milos Gligoric
- Evolution Through Large Models Joel Lehman et al.
- Generating Sequences By Learning To Self-correct Sean Welleck et al.
- CLIP-TD: CLIP Targeted Distillation For Vision-language Tasks Zhecan Wang et al.
- Fine-tuned Language Models Are Continual Learners Thomas Scialom, Tuhin Chakrabarty, Smaranda Muresan
- What Do Llms Know About Financial Markets? A Case Study On Reddit Market Sentiment Analysis Xiang Deng, Vasilisa Bashlovkina, Feng Han, Simon Baumgartner, Michael Bendersky
- Diffusion-lm Improves Controllable Text Generation Xiang Lisa Li, John Thickstun, Ishaan Gulrajani, Percy Liang, Tatsunori B. Hashimoto
- Contrastive Decoding: Open-ended Text Generation As Optimization Xiang Lisa Li et al.
- Large Language Models Can Self-improve Jiaxin Huang et al.
- Controllable Natural Language Generation With Contrastive Prefixes Jing Qian, Li Dong, Yelong Shen, Furu Wei, Weizhu Chen
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- Knowledge Prompting In Pre-trained Language Model For Natural Language Understanding Jianing Wang et al.
- Benchmarking Large Language Models For Automated Verilog RTL Code Generation Shailja Thakur et al.
- Robotic Skill Acquisition Via Instruction Augmentation With Vision-language Models Ted Xiao et al.
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Using Deepspeed And Megatron To Train Megatron-turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- Cogvideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- Coca: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Enabling Multimodal Generation On CLIP Via Vision-language Knowledge Distillation Wenliang Dai et al.
- Adapting Pre-trained Language Models To African Languages Via Multilingual Adaptive Fine-tuning Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow
- Recommendation As Language Processing (RLP): A Unified Pretrain, Personalized Prompt & Predict Paradigm (P5) Shijie Geng, Shuchang Liu, Zuohui Fu, Yingqiang Ge, Yongfeng Zhang
- Pali: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- News Summarization And Evaluation In The Era Of GPT-3 Tanya Goyal, Junyi Jessy Li, Greg Durrett
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier de la Rosa et al.
- Gpt-3-driven Pedagogical Agents For Training Children's Curious Question-asking Skills Rania Abdelghani et al.
- Camel: Mean Teacher Learning For Image Captioning Manuele Barraco et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- Neural Theory-of-mind? On The Limits Of Social Intelligence In Large Lms Maarten Sap, Ronan Le Bras, Daniel Fried, Yejin Choi
- RARR: Researching And Revising What Language Models Say, Using Language Models Luyu Gao et al.
- Vit5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- Inpars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- Lamda: Language Models For Dialog Applications Romal Thoppilan et al.
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Instructionner: A Multi-task Instruction-based Generative Framework For Few-shot NER Liwen Wang et al.
- Language Models As Zero-shot Planners: Extracting Actionable Knowledge For Embodied Agents Wenlong Huang, Pieter Abbeel, Deepak Pathak, Igor Mordatch
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Training Language Models To Follow Instructions With Human Feedback Long Ouyang et al.
- Deep Bidirectional Language-knowledge Graph Pretraining Michihiro Yasunaga et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Is Reinforcement Learning (not) For Natural Language Processing: Benchmarks, Baselines, And Building Blocks For Natural Language Policy Optimization Rajkumar Ramamurthy et al.
- Visual Prompt Tuning Menglin Jia et al.
- Reproducible Scaling Laws For Contrastive Language-image Learning Mehdi Cherti et al.
- GPT Takes The Bar Exam Michael Bommarito II, Daniel Martin Katz
- Zeroquant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- Rlprompt: Optimizing Discrete Text Prompts With Reinforcement Learning Mingkai Deng et al.
- KALA: Knowledge-augmented Language Model Adaptation Minki Kang, Jinheon Baek, Sung Ju Hwang
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-Jui Fu et al.
- Mixgen: A New Multi-modal Data Augmentation Xiaoshuai Hao et al.
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- A Generative Language Model For Few-shot Aspect-based Sentiment Analysis Ehsan Hosseini-Asl, Wenhao Liu, Caiming Xiong
- Layoutlmv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- Texts As Images In Prompt Tuning For Multi-label Image Recognition Zixian Guo et al.
- Prompt Distribution Learning Yuning Lu, Jianzhuang Liu, Yonggang Zhang, Yajing Liu, Xinmei Tian
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- LAVIS: A Library For Language-vision Intelligence Dongxu Li et al.
- Legal Prompt Engineering For Multilingual Legal Judgement Prediction Dietrich Trautmann, Alina Petrova, Frank Schilder
- Lm-nav: Robotic Navigation With Large Pre-trained Models Of Language, Vision, And Action Dhruv Shah, Blazej Osinski, Brian Ichter, Sergey Levine
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- IGLUE: A Benchmark For Transfer Learning Across Modalities, Tasks, And Languages Emanuele Bugliarello et al.
- The Stack: 3 TB Of Permissively Licensed Source Code Denis Kocetkov et al.
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Least-to-most Prompting Enables Complex Reasoning In Large Language Models Denny Zhou et al.
- Improving Passage Retrieval With Zero-shot Question Generation Devendra Singh Sachan et al.
- Adaprompt: Adaptive Model Training For Prompt-based NLP Yulong Chen et al.
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- Factpegasus: Factuality-aware Pre-training And Fine-tuning For Abstractive Summarization David Wan, Mohit Bansal
- A Unified End-to-end Retriever-reader Framework For Knowledge-based VQA Yangyang Guo et al.
- CERT: Continual Pre-training On Sketches For Library-oriented Code Generation Daoguang Zan et al.
- Large Language Models Meet Nl2code: A Survey Daoguang Zan et al.
- Promda: Prompt-based Data Augmentation For Low-resource NLU Tasks Yufei Wang et al.
- Scaling Laws And Interpretability Of Learning From Repeated Data Danny Hernandez et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Scaling Language-image Pre-training Via Masking Yanghao Li, Haoqi Fan, Ronghang Hu, Christoph Feichtenhofer, Kaiming He
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Code4struct: Code Generation For Few-shot Event Structure Prediction Xingyao Wang, Sha Li, Heng Ji
- Discovering Latent Knowledge In Language Models Without Supervision Collin Burns, Haotian Ye, Dan Klein, Jacob Steinhardt
- Noisytune: A Little Noise Can Help You Finetune Pretrained Language Models Better Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- Calibrating Sequence Likelihood Improves Conditional Language Generation Yao Zhao et al.
- DS-1000: A Natural And Reliable Benchmark For Data Science Code Generation Yuhang Lai et al.
- LAION-5B: An Open Large-scale Dataset For Training Next Generation Image-text Models Christoph Schuhmann et al.
- Binding Language Models In Symbolic Languages Zhoujun Cheng et al.
- Cont: Contrastive Neural Text Generation Chenxin An et al.
- EVA2.0: Investigating Open-domain Chinese Dialogue Systems With Large-scale Pre-training Yuxian Gu et al.
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- Audiolm: A Language Modeling Approach To Audio Generation Zalán Borsos et al.
- Adamix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Impact Of Pretraining Term Frequencies On Few-shot Reasoning Yasaman Razeghi, Robert L. Logan IV, Matt Gardner, Sameer Singh
- Language Model Compression With Weighted Low-rank Factorization Yen-Chang Hsu et al.
- Learning Vector-quantized Item Representation For Transferable Sequential Recommenders Yupeng Hou, Zhankui He, Julian McAuley, Wayne Xin Zhao
- Llm-planner: Few-shot Grounded Planning For Embodied Agents With Large Language Models Chan Hee Song et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- Learning Video Representations From Large Language Models Yue Zhao, Ishan Misra, Philipp Krähenbühl, Rohit Girdhar
- In-context Learning And Induction Heads Catherine Olsson et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian McAuley
- No More Fine-tuning? An Experimental Evaluation Of Prompt Tuning In Code Intelligence Chaozheng Wang et al.
- Enabling Conversational Interaction With Mobile UI Using Large Language Models Bryan Wang, Gang Li, Yang Li
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-Doh Oh, William Schuler
- Exploring The Limits Of Domain-adaptive Training For Detoxifying Large-scale Language Models Boxin Wang et al.
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Long Time No See! Open-domain Conversation With Long-term Persona Memory Xinchao Xu et al.
- Revisiting End-to-end Speech-to-text Translation From Scratch Biao Zhang, Barry Haddow, Rico Sennrich
- Thinking About GPT-3 In-context Learning For Biomedical IE? Think Again Bernal Jiménez Gutiérrez et al.
- St-moe: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- Prompt-aligned Gradient For Prompt Tuning Beier Zhu, Yulei Niu, Yucheng Han, Yue Wu, Hanwang Zhang
- Promptagator: Few-shot Dense Retrieval From 8 Examples Zhuyun Dai et al.
- GODEL: Large-scale Pre-training For Goal-directed Dialog Baolin Peng et al.
- Dialog Inpainting: Turning Documents Into Dialogs Zhuyun Dai et al.
- A Survey Of Vision-language Pre-trained Models Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-Collados
- Generative Language Models For Paragraph-level Question Generation Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados
- LERT: A Linguistically-motivated Pre-trained Language Model Yiming Cui, Wanxiang Che, Shijin Wang, Ting Liu
- Zero-shot Video Question Answering Via Frozen Bidirectional Language Models Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Selection-inference: Exploiting Large Language Models For Interpretable Logical Reasoning Antonia Creswell, Murray Shanahan, Irina Higgins
- GLM-130B: An Open Bilingual Pre-trained Model Aohan Zeng et al.
- Mslam: Massively Multilingual Joint Pre-training For Speech And Text Ankur Bapna et al.
- Super-naturalinstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- Prompt Tuning For Discriminative Pre-trained Language Models Yuan Yao et al.
- Generating Training Data With Language Models: Towards Zero-shot Language Understanding Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei Han
- Internet-augmented Language Models Through Few-shot Prompting For Open-domain Question Answering Angeliki Lazaridou, Elena Gribovskaya, Wojciech Stokowiec, Nikolai Grigorev
- Plug-and-play VQA: Zero-shot VQA By Conjoining Large Pretrained Models With Zero Training Anthony Meng Huat Tiong, Junnan Li, Boyang Li, Silvio Savarese, Steven C. H. Hoi
- Active Example Selection For In-context Learning Yiming Zhang, Shi Feng, Chenhao Tan
- Compositional Semantic Parsing With Large Language Models Andrew Drozdov et al.
- Personalized Prompt For Sequential Recommendation Yiqing Wu et al.
- Contrastive Search Is What You Need For Neural Text Generation Yixuan Su, Nigel Collier
- Multimodal Knowledge Alignment With Reinforcement Learning Youngjae Yu et al.
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- Memory-assisted Prompt Editing To Improve GPT-3 After Deployment Aman Madaan, Niket Tandon, Peter Clark, Yiming Yang
- WANLI: Worker And AI Collaboration For Natural Language Inference Dataset Creation Alisa Liu, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
- Language Models Can See: Plugging Visual Controls In Text Generation Yixuan Su et al.
- Should You Mask 15% In Masked Language Modeling? Alexander Wettig, Tianyu Gao, Zexuan Zhong, Danqi Chen
- PEVL: Position-enhanced Pre-training And Prompt Tuning For Vision-language Models Yuan Yao et al.
- IDPG: An Instance-dependent Prompt Generation Method Zhuofeng Wu et al.
- A Systematic Review And Replicability Study Of Bert4rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Position-guided Text Prompt For Vision-language Pre-training Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan
- Standing On The Shoulders Of Giant Frozen Language Models Yoav Levine et al.
- ATTEMPT: Parameter-efficient Multi-task Tuning Via Attentional Mixtures Of Soft Prompts Akari Asai, Mohammadreza Salehi, Matthew E. Peters, Hannaneh Hajishirzi
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- UL2: Unifying Language Learning Paradigms Yi Tay et al.
- LASP: Text-to-text Optimization For Language-aware Soft Prompting Of Vision & Language Models Adrian Bulat, Georgios Tzimiropoulos
- Towards The Next 1000 Languages In Multilingual Machine Translation: Exploring The Synergy Between Supervised And Self-supervised Learning Aditya Siddhant et al.
- Scaling Up Models And Data With t5x And seqio Adam Roberts et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- Palm: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- Prompt For Extraction? PAIE: Prompting Argument Interaction For Event Argument Extraction Yubo Ma et al.
- Optimizing Prompts For Text-to-image Generation Yaru Hao, Zewen Chi, Li Dong, Furu Wei
- Emergent Analogical Reasoning In Large Language Models Taylor Webb, Keith J. Holyoak, Hongjing Lu
- Make-a-video: Text-to-video Generation Without Text-video Data Uriel Singer et al.
- PINTO: Faithful Language Reasoning Using Prompt-generated Rationales Peifeng Wang, Aaron Chan, Filip Ilievski, Muhao Chen, Xiang Ren
- Dynamic Prompt Learning Via Policy Gradient For Semi-structured Mathematical Reasoning Pan Lu et al.
- Unnatural Instructions: Tuning Language Models With (almost) No Human Labor Or Honovich, Thomas Scialom, Omer Levy, Timo Schick
- LIFT: Language-interfaced Fine-tuning For Non-language Machine Learning Tasks Tuan Dinh et al.
- OFA: Unifying Architectures, Tasks, And Modalities Through A Simple Sequence-to-sequence Learning Framework Peng Wang et al.
- Language Models With Image Descriptors Are Strong Few-shot Video-language Learners Zhenhailong Wang et al.
- Help Me Write A Poem: Instruction Tuning As A Vehicle For Collaborative Poetry Writing Tuhin Chakrabarty, Vishakh Padmakumar, He He
- Few-shot Training Llms For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- Grounding Language With Visual Affordances Over Unstructured Data Oier Mees, Jessica Borja-diaz, Wolfram Burgard
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- On The Origin Of Hallucinations In Conversational Models: Is It The Datasets Or The Models? Nouha Dziri, Sivan Milton, Mo Yu, Osmar Zaiane, Siva Reddy
- Faithdial: A Faithful Benchmark For Information-seeking Dialogue Nouha Dziri et al.
- Measuring And Narrowing The Compositionality Gap In Language Models Ofir Press et al.
- No Language Left Behind: Scaling Human-centered Machine Translation Nllb Team et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- Delta Tuning: A Comprehensive Study Of Parameter Efficient Methods For Pre-trained Language Models Ning Ding et al.
- Crosslingual Generalization Through Multitask Finetuning Niklas Muennighoff et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- Large Language Models Struggle To Learn Long-tail Knowledge Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, Colin Raffel
- Learning To Compose Soft Prompts For Compositional Zero-shot Learning Nihal V. Nayak, Peilin Yu, Stephen H. Bach
- Quantifying Memorization Across Neural Language Models Nicholas Carlini et al.
- Clinical Prompt Learning With Frozen Language Models Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-holgado, Andrey Kormilitzin
- Large Language Models Are Reasoning Teachers Namgyu Ho, Laura Schmid, Se-young Yun
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Quark: Controllable Text Generation With Reinforced Unlearning Ximing Lu et al.
- Do Llms Understand Social Knowledge? Evaluating The Sociability Of Large Language Models With Socket Benchmark Minje Choi, Jiaxin Pei, Sagar Kumar, Chang Shu, David Jurgens
- A Simple And Effective Pruning Approach For Large Language Models Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
- Api-bank: A Comprehensive Benchmark For Tool-augmented Llms Minghao Li et al.
- Reward Design With Language Models Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh
- Scalable Extraction Of Training Data From (production) Language Models Milad Nasr et al.
- Lamini-lm: A Diverse Herd Of Distilled Models From Large-scale Instructions Minghao Wu, Abdul Waheed, Chiyu Zhang, Muhammad Abdul-mageed, Alham Fikri Aji
- Med-flamingo: A Multimodal Medical Few-shot Learner Michael Moor et al.
- A Large Language Model Approach To Educational Survey Feedback Analysis Michael J. Parker, Caitlin Anderson, Claire Stone, Yearim Oh
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- Chatgpt For Vulnerability Detection, Classification, And Repair: How Far Are We? Michael Fu, Chakkrit Tantithamthavorn, Van Nguyen, Trung Le
- Can Llms Express Their Uncertainty? An Empirical Evaluation Of Confidence Elicitation In Llms Miao Xiong et al.
- Distilling Large Language Models For Matching Patients To Clinical Trials Mauro Nievas, Aditya Basu, Yanshan Wang, Hrituraj Singh
- An Empirical Evaluation Of Using Large Language Models For Automated Unit Test Generation Max Schäfer, Sarah Nadi, Aryaz Eghbali, Frank Tip
- Few-shot Fine-tuning Vs. In-context Learning: A Fair Comparison And Evaluation Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar
- Fine-grained Human Feedback Gives Better Rewards For Language Model Training Zeqiu Wu et al.
- LLM Self Defense: By Self Examination, Llms Know They Are Being Tricked Mansi Phute et al.
- The Reversal Curse: Llms Trained On "A Is B" Fail To Learn "B Is A" Lukas Berglund et al.
- Document-level Machine Translation With Large Language Models Longyue Wang et al.
- Driving With Llms: Fusing Object-level Vector Modality For Explainable Autonomous Driving Long Chen et al.
- Reasoning On Graphs: Faithful And Interpretable Large Language Model Reasoning Linhao Luo, Yuan-fang Li, Gholamreza Haffari, Shirui Pan
- Taiyi: A Bilingual Fine-tuned Large Language Model For Diverse Biomedical Tasks Ling Luo et al.
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- A Survey On Large Language Models For Recommendation Likang Wu et al.
- Scaling Autoregressive Multi-modal Models: Pretraining And Instruction Tuning Lili Yu et al.
- Improving CLIP Training With Language Rewrites Lijie Fan, Dilip Krishnan, Phillip Isola, Dina Katabi, Yonglong Tian
- Flexkbqa: A Flexible Llm-powered Framework For Few-shot Knowledge Base Question Answering Zhenyu Li et al.
- Query2doc: Query Expansion With Large Language Models Liang Wang, Nan Yang, Furu Wei
- Improving Text Embeddings With Large Language Models Liang Wang et al.
- Zephyr: Direct Distillation Of LM Alignment Lewis Tunstall et al.
- ULIP-2: Towards Scalable Multimodal Pre-training For 3D Understanding Le Xue et al.
- Zero-shot Next-item Recommendation Using Large Pretrained Language Models Lei Wang, Ee-peng Lim
- Dissociating Language And Thought In Large Language Models Kyle Mahowald et al.
- Mvbench: A Comprehensive Multi-modal Video Understanding Benchmark Kunchang Li et al.
- Automatically Correcting Large Language Models: Surveying The Landscape Of Diverse Self-correction Strategies Liangming Pan et al.
- Sentimentgpt: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- Just Tell Me: Prompt Engineering In Business Process Management Kiran Busch, Alexander Rochlitzer, Diana Sola, Henrik Leopold
- Tallrec: An Effective And Efficient Tuning Framework To Align Large Language Model With Recommendation Keqin Bao et al.
- Large Language Models And Simple, Stupid Bugs Kevin Jesse, Toufique Ahmed, Premkumar T. Devanbu, Emily Morgan
- Speak, Memory: An Archaeology Of Books Known To Chatgpt/gpt-4 Kent K. Chang, Mackenzie Cramer, Sandeep Soni, David Bamman
- Automating Customer Service Using Langchain: Building Custom Open-source GPT Chatbot For Organizations Keivalya Pandya, Mehfuza Holia
- Just Ask For Calibration: Strategies For Eliciting Calibrated Confidence Scores From Language Models Fine-tuned With Human Feedback Katherine Tian et al.
- A Survey Of GPT-3 Family Large Language Models Including Chatgpt And GPT-4 Katikapalli Subramanyam Kalyan
- Biomedical Knowledge Graph-optimized Prompt Generation For Large Language Models Karthik Soman et al.
- Tinyclip: CLIP Distillation Via Affinity Mimicking And Weight Inheritance Kan Stephen Wu et al.
- Chipgpt: How Far Are We From Natural Language Hardware Design Kaiyan Chang et al.
- ALIP: Adaptive Language-image Pre-training With Synthetic Caption Kaicheng Yang et al.
- Aligning Instruction Tasks Unlocks Large Language Models As Zero-shot Relation Extractors Kai Zhang, Bernal Jiménez Gutiérrez, Yu Su
- Evaluation And Analysis Of Hallucination In Large Vision-language Models Junyang Wang et al.
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- Llama-reviewer: Advancing Code Review Automation With Large Language Models Through Parameter-efficient Fine-tuning Junyi Lu, Lei Yu, Xiaojia Li, Li Yang, Chun Zuo
- Transferable Decoding With Visual Entities For Zero-shot Image Captioning Junjie Fei et al.
- A Comprehensive Capability Analysis Of GPT-3 And GPT-3.5 Series Models Junjie Ye et al.
- Breaking The Silence: The Threats Of Using Llms In Software Engineering June Sallou, Thomas Durieux, Annibale Panichella
- Chatcounselor: A Large Language Models For Mental Health Support June M. Liu et al.
- Minigpt-v2: Large Language Model As A Unified Interface For Vision-language Multi-task Learning Jun Chen et al.
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- LERF: Language Embedded Radiance Fields Justin Kerr, Chung Min Kim, Ken Goldberg, Angjoo Kanazawa, Matthew Tancik
- Increasing Diversity While Maintaining Accuracy: Text Data Generation With Large Language Models And Human Interventions John Joon Young Chung, Ece Kamar, Saleema Amershi
- Exploring The Benefits Of Training Expert Language Models Over Instruction Tuning Joel Jang et al.
- Grounding Language Models To Images For Multimodal Inputs And Outputs Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried
- Missrec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- When Large Language Models Meet Personalization: Perspectives Of Challenges And Opportunities Jin Chen et al.
- Fake News In Sheep's Clothing: Robust Fake News Detection Against Llm-empowered Style Attacks Jiaying Wu, Jiafeng Guo, Bryan Hooi
- Prompt-and-align: Prompt-based Social Alignment For Few-shot Fake News Detection Jiaying Wu, Shen Li, Ailin Deng, Miao Xiong, Bryan Hooi
- Badgpt: Exploring Security Vulnerabilities Of Chatgpt Via Backdoor Attacks To Instructgpt Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun
- Empowering Molecule Discovery For Molecule-caption Translation With Large Language Models: A Chatgpt Perspective Jiatong Li et al.
- Bias And Fairness In Chatbots: An Overview Jintang Xue et al.
- Llm-grounder: Open-vocabulary 3D Visual Grounding With Large Language Model As An Agent Jianing Yang et al.
- Language Models Meet World Models: Embodied Experiences Enhance Language Models Jiannan Xiang et al.
- Think-on-graph: Deep And Responsible Reasoning Of Large Language Model On Knowledge Graph Jiashuo Sun et al.
- How Can Recommender Systems Benefit From Large Language Models: A Survey Jianghao Lin et al.
- Rella: Retrieval-enhanced Large Language Models For Lifelong Sequential Behavior Comprehension In Recommendation Jianghao Lin et al.
- On Decoder-only Architecture For Speech-to-text And Large Language Model Integration Jian Wu et al.
- Imagebind-llm: Multi-modality Instruction Tuning Jiaming Han et al.
- Ureader: Universal Ocr-free Visually-situated Language Understanding With Multimodal Large Language Model Jiabo Ye et al.
- Graphgpt: Graph Instruction Tuning For Large Language Models Jiabin Tang et al.
- Unlearn What You Want To Forget: Efficient Unlearning For Llms Jiaao Chen, Diyi Yang
- ICL-D3IE: In-context Learning With Diverse Demonstrations Updating For Document Information Extraction Jiabang He et al.
- VILA: On Pre-training For Visual Language Models Ji Lin et al.
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Larger Language Models Do In-context Learning Differently Jerry Wei et al.
- Memory-efficient Fine-tuning Of Compressed Large Language Models Via Sub-4-bit Integer Quantization Jeonghoon Kim et al.
- Physically Grounded Vision-language Models For Robotic Manipulation Jensen Gao et al.
- Auditing Large Language Models: A Three-layered Approach Jakob Mökander, Jonas Schuett, Hannah Rose Kirk, Luciano Floridi
- Evaluation Of Chatgpt On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- A Comprehensive Evaluation Of Large Language Models On Benchmark Biomedical Text Processing Tasks Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- The Curse Of Recursion: Training On Generated Data Makes Models Forget Ilia Shumailov et al.
- Llama 2: Open Foundation And Fine-tuned Chat Models Hugo Touvron et al.
- The Bigscience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset Hugo Laurençon et al.
- Doctorglm: Fine-tuning Your Chinese Doctor Is Not A Herculean Task Honglin Xiong et al.
- Bioinstruct: Instruction Tuning Of Large Language Models For Biomedical Natural Language Processing Hieu Tran, Zhichao Yang, Zonghai Yao, Hong Yu
- Large Language Models Can Infer Psychological Dispositions Of Social Media Users Heinrich Peters, Sandra Matz
- Capabilities Of GPT-4 On Medical Challenge Problems Harsha Nori, Nicholas King, Scott Mayer Mckinney, Dean Carignan, Eric Horvitz
- Can Generalist Foundation Models Outcompete Special-purpose Tuning? Case Study In Medicine Harsha Nori et al.
- Improved Baselines With Visual Instruction Tuning Haotian Liu, Chunyuan Li, Yuheng Li, Yong Jae Lee
- Is Chatgpt The Ultimate Programming Assistant -- How Far Is It? Haoye Tian et al.
- Ip-adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- Visual-language Prompt Tuning With Knowledge-guided Context Optimization Hantao Yao, Rui Zhang, Changsheng Xu
- Lasuie: Unifying Information Extraction With Latent Adaptive Structure-aware Generative Language Model Hao Fei et al.
- Extractive Summarization Via Chatgpt For Faithful Summary Generation Haopeng Zhang, Xiao Liu, Jiawei Zhang
- Personalisation Within Bounds: A Risk Taxonomy And Policy Framework For The Alignment Of Large Language Models With Personalised Feedback Hannah Rose Kirk, Bertie Vidgen, Paul Röttger, Scott A. Hale
- Video-llama: An Instruction-tuned Audio-visual Language Model For Video Understanding Hang Zhang, Xin Li, Lidong Bing
- Mplug-2: A Modularized Multi-modal Foundation Model Across Text, Image And Video Haiyang Xu et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Auggpt: Leveraging Chatgpt For Text Data Augmentation Haixing Dai et al.
- Llama Guard: Llm-based Input-output Safeguard For Human-ai Conversations Hakan Inan et al.
- The Refinedweb Dataset For Falcon LLM: Outperforming Curated Corpora With Web Data, And Web Data Only Guilherme Penedo et al.
- Dr Chatgpt, Tell Me What I Want To Hear: How Prompt Knowledge Impacts Health Answer Correctness Guido Zuccon, Bevan Koopman
- Gender Bias And Stereotypes In Large Language Models Hadas Kotek, Rikker Dockum, David Q. Sun
- Efficient Streaming Language Models With Attention Sinks Guangxuan Xiao, Yuandong Tian, Beidi Chen, Song Han, Mike Lewis
- Personality Traits In Large Language Models Greg Serapio-garcía et al.
- Level Generation Through Large Language Models Graham Todd, Sam Earle, Muhammad Umair Nasir, Michael Cerny Green, Julian Togelius
- Zhongjing: Enhancing The Chinese Medical Capabilities Of Large Language Model Through Expert Feedback And Real-world Multi-turn Dialogue Songhua Yang et al.
- Revisiting Relation Extraction In The Era Of Large Language Models Somin Wadhwa, Silvio Amir, Byron C. Wallace
- Thoughtsource: A Central Hub For Large Language Model Reasoning Data Simon Ott et al.
- From Words To Watts: Benchmarking The Energy Costs Of Large Language Model Inference Siddharth Samsi et al.
- Mariogpt: Open-ended Text2level Generation Through Large Language Models Shyam Sudhakaran et al.
- Mitigating Object Hallucinations In Large Vision-language Models Through Visual Contrastive Decoding Sicong Leng et al.
- A Survey On Multimodal Large Language Models Shukang Yin et al.
- Self-chained Image-language Model For Video Localization And Question Answering Shoubin Yu, Jaemin Cho, Prateek Yadav, Mohit Bansal
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Automl-gpt: Automatic Machine Learning With GPT Shujian Zhang, Chengyue Gong, Lemeng Wu, Xingchao Liu, Mingyuan Zhou
- Instruction Tuning For Large Language Models: A Survey Shengyu Zhang et al.
- Next-gpt: Any-to-any Multimodal LLM Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-seng Chua
- Large Language Model Augmented Narrative Driven Recommendations Sheshera Mysore, Andrew Mccallum, Hamed Zamani
- VIP5: Towards Multimodal Foundation Models For Recommendation Shijie Geng, Juntao Tan, Shuchang Liu, Zuohui Fu, Yongfeng Zhang
- Unifying Large Language Models And Knowledge Graphs: A Roadmap Shirui Pan et al.
- Mixture-of-experts Meets Instruction Tuning: A Winning Combination For Large Language Models Sheng Shen et al.
- Scaling Vision-language Models With Sparse Mixture Of Experts Sheng Shen et al.
- The Flan Collection: Designing Data And Methods For Effective Instruction Tuning Shayne Longpre et al.
- Evaluation Of Chatgpt Family Of Models For Biomedical Reasoning And Classification Shan Chen et al.
- Sur-adapter: Enhancing Text-to-image Pre-trained Diffusion Models With Large Language Models Shanshan Zhong, Zhongzhan Huang, Wushao Wen, Jinghui Qin, Liang Lin
- The Cot Collection: Improving Zero-shot And Few-shot Learning Of Language Models Via Chain-of-thought Fine-tuning Seungone Kim et al.
- The Moral Authority Of Chatgpt Sebastian Krügel, Andreas Ostermaier, Matthias Uhl
- A Comparative Study Of Open-source Large Language Models, GPT-4 And Claude 2: Multiple-choice Test Taking In Nephrology Sean Wu et al.
- Large Language Models Are Competitive Near Cold-start Recommenders For Language- And Item-based Preferences Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon
- Scalable Educational Question Generation With Pre-trained Language Models Sahan Bulathwela, Hamze Muse, Emine Yilmaz
- Fine-tuning Language Models With Just Forward Passes Sadhika Malladi et al.
- SPHINX: The Joint Mixing Of Weights, Tasks, And Visual Embeddings For Multi-modal Large Language Models Ziyi Lin et al.
- Pushing Large Language Models To The 6G Edge: Vision, Challenges, And Opportunities Zheng Lin et al.
- Does Synthetic Data Generation Of Llms Help Clinical Text Mining? Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, Xia Hu
- Secrets Of RLHF In Large Language Models Part I: PPO Rui Zheng et al.
- Gpt4tools: Teaching Large Language Model To Use Tools Via Self-instruction Rui Yang et al.
- Pro-cap: Leveraging A Frozen Vision-language Model For Hateful Meme Detection Rui Cao et al.
- Tinystories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- Beyond Memorization: Violating Privacy Via Inference With Large Language Models Robin Staab, Mark Vero, Mislav Balunović, Martin Vechev
- Automatic Prompt Optimization With "gradient Descent" And Beam Search Reid Pryzant et al.
- Chatgpt Versus Traditional Question Answering For Knowledge Graphs: Current Status And Future Directions Towards Knowledge Graph Chatbots Reham Omar, Omij Mangukiya, Panos Kalnis, Essam Mansour
- Prompt, Generate, Then Cache: Cascade Of Foundation Models Makes Strong Few-shot Learners Renrui Zhang et al.
- Sabiá: Portuguese Large Language Models Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira
- Llama-adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Lawyer Llama Technical Report Quzhe Huang et al.
- Direct Preference Optimization: Your Language Model Is Secretly A Reward Model Rafael Rafailov et al.
- Mplug-owl: Modularization Empowers Large Language Models With Multimodality Qinghao Ye et al.
- ONCE: Boosting Content-based Recommendation With Both Open- And Closed-source Large Language Models Qijiong Liu, Nuo Chen, Tetsuya Sakai, Xiao-ming Wu
- Adalora: Adaptive Budget Allocation For Parameter-efficient Fine-tuning Qingru Zhang et al.
- Medcpt: Contrastive Pre-trained Transformers With Large-scale Pubmed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Scaling Laws For Language Encoding Models In Fmri Richard Antonello, Aditya Vaidya, Alexander G. Huth
- Masked Vision And Language Pre-training With Unimodal And Multimodal Contrastive Losses For Medical Visual Question Answering Pengfei Li, Gang Liu, Jinlong He, Zixu Zhao, Shenjun Zhong
- Llama-adapter V2: Parameter-efficient Visual Instruction Model Peng Gao et al.
- Audiopalm: A Large Language Model That Can Speak And Listen Paul K. Rubenstein et al.
- Pre-train, Prompt And Recommendation: A Comprehensive Survey Of Language Modelling Paradigm Adaptations In Recommender Systems Peng Liu, Lemei Zhang, Jon Atle Gulla
- Internlm-xcomposer: A Vision-language Large Model For Advanced Text-image Comprehension And Composition Pan Zhang et al.
- In-context Retrieval-augmented Language Models Ori Ram et al.
- Drivegpt4: Interpretable End-to-end Autonomous Driving Via Large Language Model Zhenhua Xu et al.
- GPT-4 Technical Report Openai et al.
- Fine-tuning Or Retrieval? Comparing Knowledge Injection In Llms Oded Ovadia, Menachem Brief, Moshik Mishaeli, Oren Elisha
- Fusecap: Leveraging Large Language Models For Enriched Fused Image Captions Noam Rotstein, David Bensaid, Shaked Brody, Roy Ganz, Ron Kimmel
- Reflexion: Language Agents With Verbal Reinforcement Learning Noah Shinn et al.
- Large Language Models Are Built-in Autoregressive Search Engines Noah Ziems, Wenhao Yu, Zhihan Zhang, Meng Jiang
- Enhancing Chat Language Models By Scaling High-quality Instructional Conversations Ning Ding et al.
- Emergent And Predictable Memorization In Large Language Models Stella Biderman et al.
- LISA: Reasoning Segmentation Via Large Language Model Xin Lai et al.
- CAT-LM: Training Language Models On Aligned Code And Tests Nikitha Rao, Kush Jain, Uri Alon, Claire Le Goues, Vincent J. Hellendoorn
- Sources Of Hallucination By Large Language Models On Inference Tasks Nick Mckenna et al.
- Jais And Jais-chat: Arabic-centric Foundation And Instruction-tuned Open Generative Large Language Models Neha Sengupta et al.
- Self-regulating Prompts: Foundational Model Adaptation Without Forgetting Muhammad Uzair Khattak et al.
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- Using Large Language Models To Generate Junit Tests: An Empirical Study Mohammed Latif Siddiq et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Scaling Down To Scale Up: A Guide To Parameter-efficient Fine-tuning Vladislav Lialin, Vijeta Deshpande, Xiaowei Yao, Anna Rumshisky
- Do Llms Understand User Preferences? Evaluating Llms On User Rating Prediction Wang-cheng Kang et al.
- Inpars-v2: Large Language Models As Efficient Dataset Generators For Information Retrieval Vitor Jeronymo et al.
- Chatgpt Beyond English: Towards A Comprehensive Evaluation Of Large Language Models In Multilingual Learning Viet Dac Lai et al.
- Evaluating Correctness And Faithfulness Of Instruction-following Models For Question Answering Vaibhav Adlakha, Parishad Behnamghader, Xing Han Lu, Nicholas Meade, Siva Reddy
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Flashattention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Nemo Guardrails: A Toolkit For Controllable And Safe LLM Applications With Programmable Rails Traian Rebedea, Razvan Dinu, Makesh Sreedhar, Christopher Parisien, Jonathan Cohen
- Generalized Planning In PDDL Domains With Pretrained Large Language Models Tom Silver et al.
- Open-ended Medical Visual Question Answering Through Prefix Tuning Of Language Models Tom Van Sonsbeek, Mohammad Mahdi Derakhshani, Ivona Najdenkoska, Cees G. M. Snoek, Marcel Worring
- Toolformer: Language Models Can Teach Themselves To Use Tools Timo Schick et al.
- Pretraining Language Models With Human Preferences Tomasz Korbak et al.
- Spqr: A Sparse-quantized Representation For Near-lossless LLM Weight Compression Tim Dettmers et al.
- Medalpaca -- An Open-source Collection Of Medical Conversational AI Models And Training Data Tianyu Han et al.
- Qlora: Efficient Finetuning Of Quantized Llms Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer
- Few-shot In-context Learning For Knowledge Base Question Answering Tianle Li et al.
- Having Beer After Prayer? Measuring Cultural Bias In Large Language Models Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu
- Multimodal-gpt: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Large Language Models As General Pattern Machines Suvir Mirchandani et al.
- Orca: Progressive Learning From Complex Explanation Traces Of GPT-4 Subhabrata Mukherjee et al.
- Pythia: A Suite For Analyzing Large Language Models Across Training And Scaling Stella Biderman et al.
- LAMM: Language-assisted Multi-modal Instruction-tuning Dataset, Framework, And Benchmark Zhenfei Yin et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- Instructblip: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- A Preliminary Evaluation Of Chatgpt For Zero-shot Dialogue Understanding Wenbo Pan, Qiguang Chen, Xiao Xu, Wanxiang Che, Libo Qin
- Is Chatgpt Good At Search? Investigating Large Language Models As Re-ranking Agents Weiwei Sun et al.
- Trusting Your Evidence: Hallucinate Less With Context-aware Decoding Weijia Shi et al.
- Medagents: Large Language Models As Collaborators For Zero-shot Medical Reasoning Xiangru Tang et al.
- Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To! Xiangyu Qi et al.
- HPC-GPT: Integrating Large Language Model For High-performance Computing Xianzhong Ding et al.
- Don't Trust Chatgpt When Your Question Is Not In English: A Study Of Multilingual Abilities And Types Of Llms Xiang Zhang, Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
- Alpha-clip: A CLIP Model Focusing On Wherever You Want Zeyi Sun et al.
- Navgpt: Explicit Reasoning In Vision-and-language Navigation With Large Language Models Gengze Zhou, Yicong Hong, Qi Wu
- Cheap And Quick: Efficient Vision-language Instruction Tuning For Large Language Models Gen Luo et al.
- Voyager: An Open-ended Embodied Agent With Large Language Models Guanzhi Wang et al.
- Gemini: A Family Of Highly Capable Multimodal Models Gemini Team et al.
- Performance Of The Pre-trained Large Language Model GPT-4 On Automated Short Answer Grading Gerd Kortemeyer
- The Rise And Potential Of Large Language Model Based Agents: A Survey Zhiheng Xi et al.
- LLMR: Real-time Prompting Of Interactive Worlds Using Large Language Models Fernanda De La Torre et al.
- Codegen2: Lessons For Training Llms On Programming And Natural Languages Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou
- Do We Still Need Clinical Language Models? Eric Lehman et al.
- Towards Efficient Fine-tuning Of Pre-trained Code Models: An Experimental Study And Beyond Ensheng Shi et al.
- Principle-driven Self-alignment Of Language Models From Scratch With Minimal Human Supervision Zhiqing Sun et al.
- Aligning Large Multimodal Models With Factually Augmented RLHF Zhiqing Sun et al.
- Sparsegpt: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Fine-tuning Chatgpt For Automatic Scoring Ehsan Latif, Xiaoming Zhai
- The Falcon Series Of Open Language Models Ebtesam Almazrouei et al.
- Speechgpt: Empowering Large Language Models With Intrinsic Cross-modal Conversational Abilities Dong Zhang et al.
- The Vector Grounding Problem Dimitri Coelho Mollo, Raphaël Millière
- One Adapter For All Programming Languages? Adapter Tuning For Code Search And Summarization Deze Wang et al.
- Promptner: Prompting For Named Entity Recognition Dhananjay Ashok, Zachary C. Lipton
- Evaluating GPT-3.5 And GPT-4 Models On Brazilian University Admission Exams Desnes Nunes, Ricardo Primi, Ramon Pires, Roberto Lotufo, Rodrigo Nogueira
- GPT-4 Can Pass The Korean National Licensing Examination For Korean Medicine Doctors Dongyeop Jang, Tae-rim Yun, Choong-yeol Lee, Young-kyu Kwon, Chang-eop Kim
- Text-to-sql Empowered By Large Language Models: A Benchmark Evaluation Dawei Gao et al.
- Improving Accuracy Of GPT-3/4 Results On Biomedical Data Using A Retrieval-augmented Language Model David Soong et al.
- The Capacity For Moral Self-correction In Large Language Models Deep Ganguli et al.
- Palm-e: An Embodied Multimodal Language Model Danny Driess et al.
- SOLAR 10.7B: Scaling Large Language Models With Simple Yet Effective Depth Up-scaling Dahyun Kim et al.
- Llava-med: Training A Large Language-and-vision Assistant For Biomedicine In One Day Chunyuan Li et al.
- LIMA: Less Is More For Alignment Chunting Zhou et al.
- Multimodal Foundation Models: From Specialists To General-purpose Assistants Chunyuan Li et al.
- An Iterative Optimizing Framework For Radiology Report Summarization With Chatgpt Chong Ma et al.
- A Study On The Implementation Of Generative AI Services Using An Enterprise Data-based LLM Application Architecture Cheonsu Jeong
- Llm-powered Data Augmentation For Enhanced Cross-lingual Performance Chenxi Whitehouse, Monojit Choudhury, Alham Fikri Aji
- Debiasing Vision-language Models Via Biased Prompts Ching-yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka
- Model Tuning Or Prompt Tuning? A Study Of Large Language Models For Clinical Concept And Relation Extraction Cheng Peng et al.
- K2: A Foundation Language Model For Geoscience Knowledge Understanding And Utilization Cheng Deng et al.
- Pmc-llama: Towards Building Open-source Language Models For Medicine Chaoyi Wu et al.
- Generative Speech Recognition Error Correction With Large Language Models And Task-activating Prompting Chao-han Huck Yang et al.
- Blackvip: Black-box Visual Prompting For Robust Transfer Learning Changdae Oh et al.
- Compositional Chain-of-thought Prompting For Large Multimodal Models Chancharik Mitra, Brandon Huang, Trevor Darrell, Roei Herzig
- A Confederacy Of Models: A Comprehensive Evaluation Of Llms On Creative Writing Carlos Gómez-rodríguez, Paul Williams
- Wizardlm: Empowering Large Language Models To Follow Complex Instructions Can Xu et al.
- Distilling Step-by-step! Outperforming Larger Language Models With Less Training Data And Smaller Model Sizes Cheng-yu Hsieh et al.
- Reinforced Self-training (rest) For Language Modeling Caglar Gulcehre et al.
- Adapting Large Language Models By Integrating Collaborative Semantics For Recommendation Bowen Zheng et al.
- Prompting Or Fine-tuning? A Comparative Study Of Large Language Models For Taxonomy Construction Boqi Chen, Fandi Yi, Dániel Varró
- RWKV: Reinventing Rnns For The Transformer Era Bo Peng et al.
- Can Large Language Models Transform Computational Social Science? Caleb Ziems et al.
- Vtimellm: Empower LLM To Grasp Video Moments Bin Huang, Xin Wang, Hong Chen, Zihan Song, Wenwu Zhu
- Motiongpt: Human Motion As A Foreign Language Biao Jiang et al.
- Prompting Large Language Model For Machine Translation: A Case Study Biao Zhang, Barry Haddow, Alexandra Birch
- Instruction Tuning With GPT-4 Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao
- Clinical Camel: An Open Expert-level Medical Language Model With Dialogue-based Knowledge Encoding Augustin Toma et al.
- 3d-vista: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Supporting Qualitative Analysis With Large Language Models: Combining Codebook With GPT-3 For Deductive Coding Ziang Xiao, Xingdi Yuan, Q. Vera Liao, Rania Abdelghani, Pierre-Yves Oudeyer
- Orca 2: Teaching Small Language Models How To Reason Arindam Mitra et al.
- RT-2: Vision-language-action Models Transfer Web Knowledge To Robotic Control Anthony Brohan et al.
- Expel: LLM Agents Are Experiential Learners Andrew Zhao et al.
- Synthetic Data Generation With Large Language Models For Text Classification: Potential And Limitations Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
- Opening Up Chatgpt: Tracking Openness, Transparency, And Accountability In Instruction-tuned Text Generators Andreas Liesenfeld, Alianda Lopez, Mark Dingemanse
- Fundamentals Of Generative Large Language Models And Perspectives In Cyber-defense Andrei Kucharavy et al.
- Openflamingo: An Open-source Framework For Training Large Autoregressive Vision-language Models Anas Awadalla et al.
- Openassistant Conversations -- Democratizing Large Language Model Alignment Andreas Köpf et al.
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- Self-refine: Iterative Refinement With Self-feedback Aman Madaan et al.
- Jailbroken: How Does LLM Safety Training Fail? Alexander Wei, Nika Haghtalab, Jacob Steinhardt
- Lamp: When Large Language Models Meet Personalization Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
- Can Chatgpt Forecast Stock Price Movements? Return Predictability And Large Language Models Alejandro Lopez-Lira, Yuehua Tang
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Baichuan 2: Open Large-scale Language Models Aiyuan Yang et al.
- Clipsyntel: CLIP And LLM Synergy For Multimodal Question Summarization In Healthcare Akash Ghosh et al.
- Calibrated Language Models Must Hallucinate Adam Tauman Kalai, Santosh S. Vempala
- Should Chatgpt Be Biased? Challenges And Risks Of Bias In Large Language Models Emilio Ferrara
- Biomedgpt: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- Pandagpt: One Model To Instruction-follow Them All Yixuan Su et al.
- Graph Neural Prompting With Large Language Models Yijun Tian et al.
- Efficient And Effective Text Encoding For Chinese Llama And Alpaca Yiming Cui, Ziqing Yang, Xin Yao
- Summary Of Chatgpt-related Research And Perspective Towards The Future Of Large Language Models Yiheng Liu et al.
- Making Large Language Models Perform Better In Knowledge Graph Completion Yichi Zhang et al.
- INSTRUCTEVAL: Towards Holistic Evaluation Of Instruction-tuned Large Language Models Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria
- Prompting Large Language Models With Speech Recognition Abilities Yassir Fathullah et al.
- Adaptive Machine Translation With Large Language Models Yasmin Moslem, Rejwanul Haque, John D. Kelleher, Andy Way
- Embodiedgpt: Vision-language Pre-training Via Embodied Chain Of Thought Yao Mu et al.
- Alpacafarm: A Simulation Framework For Methods That Learn From Human Feedback Yann Dubois et al.
- Bubogpt: Enabling Visual Grounding In Multi-modal Llms Yang Zhao et al.
- Key-locked Rank One Editing For Text-to-image Personalization Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon
- Improving Large Language Models For Clinical Named Entity Recognition Via Prompt Engineering Yan Hu et al.
- Emotional Intelligence Of Large Language Models Xuena Wang, Xueting Li, Zi Yin, Yue Wu, Liu Jia
- Fine-tuning Llama For Multi-stage Text Retrieval Xueguang Ma, Liang Wang, Nan Yang, Furu Wei, Jimmy Lin
- Xuanyuan 2.0: A Large Chinese Financial Chat Model With Hundreds Of Billions Parameters Xuanyu Zhang, Qing Yang, Dongliang Xu
- Character-llm: A Trainable Agent For Role-playing Yunfan Shao, Linyang Li, Junqi Dai, Xipeng Qiu
- An Empirical Study Of Catastrophic Forgetting In Large Language Models During Continual Fine-tuning Yun Luo et al.
- Towards Open-world Recommendation With Knowledge Augmentation From Large Language Models Yunjia Xi et al.
- Toolllm: Facilitating Large Language Models To Master 16000+ Real-world Apis Yujia Qin et al.
- Exploring The Impact Of Instruction Data Scaling On Large Language Models: An Empirical Study On Real-world Use Cases Yunjie Ji et al.
- Educhat: A Large-scale Language Model-based Chatbot System For Intelligent Education Yuhao Dan et al.
- Aligning Large Language Models With Human: A Survey Yufei Wang et al.
- Large Language Model As Attributed Training Data Generator: A Tale Of Diversity And Bias Yue Yu et al.
- Toolqa: A Dataset For LLM Question Answering With External Tools Yuchen Zhuang, Yue Yu, Kuan Wang, Haotian Sun, Chao Zhang
- Low-rank Adaptation Of Large Language Model Rescoring For Parameter-efficient Speech Recognition Yu Yu et al.
- NL2TL: Transforming Natural Languages To Temporal Logics Using Large Language Models Yongchao Chen, Rujul Gandhi, Yang Zhang, Chuchu Fan
- Assessing Cross-cultural Alignment Between Chatgpt And Human Societies: An Empirical Study Yong Cao et al.
- Minillm: Knowledge Distillation Of Large Language Models Yuxian Gu, Li Dong, Furu Wei, Minlie Huang
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- Guiding Pretraining In Reinforcement Learning With Large Language Models Yuqing Du et al.
- Chatdoctor: A Medical Chat Model Fine-tuned On A Large Language Model Meta-ai (llama) Using Medical Domain Knowledge Yunxiang Li et al.
- MEDITRON-70B: Scaling Medical Pretraining For Large Language Models Zeming Chen et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Guiding Large Language Models Via Directional Stimulus Prompting Zekun Li et al.
- Llm-adapters: An Adapter Family For Parameter-efficient Fine-tuning Of Large Language Models Zhiqiang Hu et al.
- Ghost In The Minecraft: Generally Capable Agents For Open-world Environments Via Large Language Models With Text-based Knowledge And Memory Xizhou Zhu et al.
- Llm-pruner: On The Structural Pruning Of Large Language Models Xinyin Ma, Gongfan Fang, Xinchao Wang
- Mental-llm: Leveraging Large Language Models For Mental Health Prediction Via Online Text Data Xuhai Xu et al.
- Recommender Systems In The Era Of Large Language Models (llms) Zihuai Zhao et al.
- R2gengpt: Radiology Report Generation With Frozen Llms Zhanyu Wang, Lingqiao Liu, Lei Wang, Luping Zhou
- Large Language Models As Zero-shot Conversational Recommenders Zhankui He et al.
- A Survey On RAG Meeting Llms: Towards Retrieval-augmented Large Language Models Wenqi Fan et al.
- Billm: Pushing The Limit Of Post-training Quantization For Llms Wei Huang et al.
- The Ultimate Guide To Fine-tuning Llms From Basics To Breakthroughs: An Exhaustive Review Of Technologies, Research, Best Practices, Applied Research Challenges And Opportunities Venkatesh Balavadhani Parthasarathy, Ahtsham Zafar, Aafaq Khan, Arsalan Shahid
- Continual Learning For Large Language Models: A Survey Tongtong Wu et al.
- Chatglm: A Family Of Large Language Models From GLM-130B To GLM-4 All Tools Team GLM et al.
- Adaptmllm: Fine-tuning Multilingual Language Models On Low-resource Languages With Integrated LLM Playgrounds Séamus Lankford, Haithem Afli, Andy Way
- Chatgpt As Research Scientist: Probing Gpt's Capabilities As A Research Librarian, Research Ethicist, Data Generator And Data Predictor Steven A. Lehr, Aylin Caliskan, Suneragiri Liyanage, Mahzarin R. Banaji
- The Era Of 1-bit Llms: All Large Language Models Are In 1.58 Bits Shuming Ma et al.
- Eyes Wide Shut? Exploring The Visual Shortcomings Of Multimodal Llms Shengbang Tong et al.
- Large Language Models Meet Collaborative Filtering: An Efficient All-round Llm-based Recommender System Sein Kim et al.
- A Comprehensive Survey Of Hallucination Mitigation Techniques In Large Language Models S. M. Towhidul Islam Tonmoy et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Fine-tuned Language Models Generate Stable Inorganic Materials As Text Nate Gruver et al.
- Findings Of The Second Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Michael Y. Hu et al.
- Exploring Chatgpt And Its Impact On Society Md. Asraful Haque, Shuai Li
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- Data Is All You Need: Finetuning Llms For Chip Design Via An Automated Design-data Augmentation Framework Kaiyan Chang et al.
- Pixart-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- ORPO: Monolithic Preference Optimization Without Reference Model Jiwoo Hong, Noah Lee, James Thorne
- Openmedlm: Prompt Engineering Can Out-perform Fine-tuning In Medical Question-answering With Open-source Large Language Models Jenish Maharjan et al.
- Closing The Gap Between Open-source And Commercial Large Language Models For Medical Evidence Summarization Gongbo Zhang et al.
- Embedding Large Language Models Into Extended Reality: Opportunities And Challenges For Inclusion, Engagement, And Privacy Efe Bozkir et al.
- Olmo: Accelerating The Science Of Language Models Dirk Groeneveld et al.
- The Revolution Of Multimodal Large Language Models: A Survey Davide Caffagni et al.
- Deepseek-v2: A Strong, Economical, And Efficient Mixture-of-experts Language Model DeepSeek-AI et al.
- Moe-llava: Mixture Of Experts For Large Vision-language Models Bin Lin et al.
- Understanding Large-language Model (llm)-powered Human-robot Interaction Callie Y. Kim, Christine P. Lee, Bilge Mutlu
- Why And When Llm-based Assistants Can Go Wrong: Investigating The Effectiveness Of Prompt-based Interactions For Software Help-seeking Anjali Khurana, Hari Subramonyam, Parmit K Chilana
- RAG Vs Fine-tuning: Pipelines, Tradeoffs, And A Case Study On Agriculture Angels Balaguer et al.
- AI And Memory Wall Amir Gholami et al.
- Financial Statement Analysis With Large Language Models Alex Kim, Maximilian Muhn, Valeri Nikolaev
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
- Unist: A Prompt-empowered Universal Model For Urban Spatio-temporal Prediction Yuan Yuan, Jingtao Ding, Jie Feng, Depeng Jin, Yong Li
- Understanding Llms: A Comprehensive Overview From Training To Inference Yiheng Liu et al.
- Llamafactory: Unified Efficient Fine-tuning Of 100+ Language Models Yaowei Zheng et al.
- Datasets For Large Language Models: A Comprehensive Survey Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin
- Mgte: Generalized Long-context Text Representation And Reranking Models For Multilingual Text Retrieval Xin Zhang et al.
- Data-efficient Fine-tuning For Llm-based Recommendation Xinyu Lin et al.
- Harnessing Large Language Models For Text-rich Sequential Recommendation Zhi Zheng, Wenshuo Chao, Zhaopeng Qiu, Hengshu Zhu, Hui Xiong
- Large Language Models For Data Annotation And Synthesis: A Survey Zhen Tan et al.
- Promptkd: Unsupervised Prompt Distillation For Vision-language Models Zheng Li et al.
- Llmparser: An Exploratory Study On Using Large Language Models For Log Parsing Zeyang Ma, An Ran Chen, Dong Jae Kim, Tse-hsun Chen, Shaowei Wang
- Does Fine-tuning Llms On New Knowledge Encourage Hallucinations? Zorik Gekhman et al.
- Deepseek-r1: Incentivizing Reasoning Capability In Llms Via Reinforcement Learning DeepSeek-AI et al.
- Findings Of The Babylm Challenge: Sample-efficient Pretraining On Developmentally Plausible Corpora Alex Warstadt et al.
🏷 Transformer
- Topic Aware Neural Response Generation Chen Xing et al.
- Gated-attention Architectures For Task-oriented Language Grounding Devendra Singh Chaplot, Kanthashree Mysore Sathyendra, Rama Kumar Pasumarthi, Dheeraj Rajagopal, Ruslan Salakhutdinov
- Attention Is All You Need Ashish Vaswani et al.
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Frustratingly Short Attention Spans In Neural Language Modeling Michał Daniluk, Tim Rocktäschel, Johannes Welbl, Sebastian Riedel
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- Multi-cast Attention Networks For Retrieval-based Question Answering And Response Prediction Yi Tay, Luu Anh Tuan, Siu Cheung Hui
- Improving The Transformer Translation Model With Document-level Context Jiacheng Zhang et al.
- Training Tips For The Transformer Model Martin Popel, Ondřej Bojar
- Sequence-to-sequence Learning For Task-oriented Dialogue With Dialogue State Representation Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin, Ting Liu
- Seq2rdf: An End-to-end Application For Deriving Triples From Natural Language Text Yue Liu, Tongtao Zhang, Zhicheng Liang, Heng Ji, Deborah L. Mcguinness
- An Affect-rich Neural Conversational Model With Biased Attention And Weighted Cross-entropy Loss Peixiang Zhong, Di Wang, Chunyan Miao
- Character-level Language Modeling With Deeper Self-attention Rami Al-rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
- Sdnet: Contextualized Attention-based Deep Network For Conversational Question Answering Chenguang Zhu, Michael Zeng, Xuedong Huang
- Multilingual Constituency Parsing With Self-attention And Pre-training Nikita Kitaev, Steven Cao, Dan Klein
- Commonsense For Generative Multi-hop Question Answering Tasks Lisa Bauer, Yicheng Wang, Mohit Bansal
- Pervasive Attention: 2D Convolutional Neural Networks For Sequence-to-sequence Prediction Maha Elbayad, Laurent Besacier, Jakob Verbeek
- "Bilingual Expert" Can Find Translation Errors Kai Fan et al.
- The Memad Submission To The WMT18 Multimodal Translation Task Stig-arne Grönroos et al.
- Hierarchical Neural Story Generation Angela Fan, Mike Lewis, Yann Dauphin
- BERT: Pre-training Of Deep Bidirectional Transformers For Language Understanding Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
- Language Modeling With Deep Transformers Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney
- Non-autoregressive Transformer By Position Learning Yu Bao et al.
- Unified Vision-language Pre-training For Image Captioning And VQA Luowei Zhou et al.
- Controlling The Output Length Of Neural Machine Translation Surafel Melaku Lakew, Mattia Di Gangi, Marcello Federico
- Visualbert: A Simple And Performant Baseline For Vision And Language Liunian Harold Li, Mark Yatskar, Da Yin, Cho-Jui Hsieh, Kai-Wei Chang
- Fully Quantized Transformer For Machine Translation Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
- MKD: A Multi-task Knowledge Distillation Approach For Pretrained Language Models Linqing Liu, Huan Wang, Jimmy Lin, Richard Socher, Caiming Xiong
- Efficient Adaptation Of Pretrained Transformers For Abstractive Summarization Andrew Hoang, Antoine Bosselut, Asli Celikyilmaz, Yejin Choi
- Multimodal Attention Networks For Low-level Vision-and-language Navigation Federico Landi, Lorenzo Baraldi, Marcella Cornia, Massimiliano Corsini, Rita Cucchiara
- Bert4rec: Sequential Recommendation With Bidirectional Encoder Representations From Transformer Fei Sun et al.
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Pretrained Language Models For Sequential Sentence Classification Arman Cohan, Iz Beltagy, Daniel King, Bhavana Dalvi, Daniel S. Weld
- Incremental Transformer With Deliberation Decoder For Document Grounded Conversations Zekang Li et al.
- Unified Language Model Pre-training For Natural Language Understanding And Generation Li Dong et al.
- Recosa: Detecting The Relevant Contexts With Self-attention For Multi-turn Dialogue Generation Hainan Zhang, Yanyan Lan, Liang Pang, Jiafeng Guo, Xueqi Cheng
- Revealing The Dark Secrets Of BERT Olga Kovaleva, Alexey Romanov, Anna Rogers, Anna Rumshisky
- Interpreting And Improving Natural-language Processing (in Machines) With Natural Language-processing (in The Brain) Mariya Toneva, Leila Wehbe
- Reducing Transformer Depth On Demand With Structured Dropout Angela Fan, Edouard Grave, Armand Joulin
- Adapting And Evaluating A Deep Learning Language Model For Clinical Why-question Answering Andrew Wen, Mohamed Y. Elwazir, Sungrim Moon, Jungwei Fan
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- Augmenting Self-attention With Persistent Memory Sainbayar Sukhbaatar, Edouard Grave, Guillaume Lample, Herve Jegou, Armand Joulin
- Entity-consistent End-to-end Task-oriented Dialogue System With KB Retriever Libo Qin et al.
- LXMERT: Learning Cross-modality Encoder Representations From Transformers Hao Tan, Mohit Bansal
- Neural Assistant: Joint Action Prediction, Response Generation, And Latent Knowledge Reasoning Arvind Neelakantan et al.
- BERT For Joint Intent Classification And Slot Filling Qian Chen, Zhu Zhuo, Wen Wang
- Camembert: A Tasty French Language Model Louis Martin et al.
- TANDA: Transfer And Adapt Pre-trained Transformer Models For Answer Sentence Selection Siddhant Garg, Thuy Vu, Alessandro Moschitti
- Plug And Play Language Models: A Simple Approach To Controlled Text Generation Sumanth Dathathri et al.
- Cloze-driven Pretraining Of Self-attention Networks Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
- Unsupervised Cross-lingual Representation Learning At Scale Alexis Conneau et al.
- Dialogpt: Large-scale Generative Pre-training For Conversational Response Generation Yizhe Zhang et al.
- Sample Efficient Text Summarization Using A Single Pre-trained Transformer Urvashi Khandelwal, Kevin Clark, Dan Jurafsky, Lukasz Kaiser
- Unicoder-vl: A Universal Encoder For Vision And Language By Cross-modal Pre-training Gen Li et al.
- Context-aware Learning For Neural Machine Translation Sébastien Jean, Kyunghyun Cho
- Are Sixteen Heads Really Better Than One? Paul Michel, Omer Levy, Graham Neubig
- What Would Elsa Do? Freezing Layers During Transformer Fine-tuning Jaejun Lee, Raphael Tang, Jimmy Lin
- PLATO: Pre-trained Dialogue Generation Model With Discrete Latent Variable Siqi Bao, Huang He, Fan Wang, Hua Wu, Haifeng Wang
- Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection Guangxiang Zhao et al.
- Structured Pruning Of A Bert-based Question Answering Model J. S. McCarley, Rishav Chakravarti, Avirup Sil
- Multimodal Transformer Networks For End-to-end Video-grounded Dialogue Systems Hung Le, Doyen Sahoo, Nancy F. Chen, Steven C. H. Hoi
- MUSE: Parallel Multi-scale Attention For Sequence To Sequence Learning Guangxiang Zhao, Xu Sun, Jingjing Xu, Zhiyuan Zhang, Liangchen Luo
- Learning And Evaluating Contextual Embedding Of Source Code Aditya Kanade, Petros Maniatis, Gogul Balakrishnan, Kensen Shi
- The Second Conversational Intelligence Challenge (convai2) Emily Dinan et al.
- Contextualized Sparse Representations For Real-time Open-domain Question Answering Jinhyuk Lee, Minjoon Seo, Hannaneh Hajishirzi, Jaewoo Kang
- Encode, Tag, Realize: High-precision Text Editing Eric Malmi, Sebastian Krause, Sascha Rothe, Daniil Mirylenka, Aliaksei Severyn
- Semantically Conditioned Dialog Response Generation Via Hierarchical Disentangled Self-attention Wenhu Chen, Jianshu Chen, Pengda Qin, Xifeng Yan, William Yang Wang
- Transfertransfo: A Transfer Learning Approach For Neural Network Based Conversational Agents Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
- How Does BERT Answer Questions? A Layer-wise Analysis Of Transformer Representations Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Dialogue Transformers Vladimir Vlasov, Johannes E. M. Mosig, Alan Nichol
- Exbert: A Visual Analysis Tool To Explore Learned Representations In Transformers Models Benjamin Hoover, Hendrik Strobelt, Sebastian Gehrmann
- Repurposing Entailment For Multi-hop Question Answering Tasks Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian
- PEGASUS: Pre-training With Extracted Gap-sentences For Abstractive Summarization Jingqing Zhang, Yao Zhao, Mohammad Saleh, Peter J. Liu
- Synchronous Bidirectional Inference For Neural Sequence Generation Jiajun Zhang, Long Zhou, Yang Zhao, Chengqing Zong
- VL-BERT: Pre-training Of Generic Visual-linguistic Representations Weijie Su et al.
- Bp-transformer: Modelling Long-range Context Via Binary Partitioning Zihao Ye, Qipeng Guo, Quan Gan, Xipeng Qiu, Zheng Zhang
- Fast Transformer Decoding: One Write-head Is All You Need Noam Shazeer
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- Learning To Deceive With Attention-based Explanations Danish Pruthi, Mansi Gupta, Bhuwan Dhingra, Graham Neubig, Zachary C. Lipton
- Stabilizing Transformers For Reinforcement Learning Emilio Parisotto et al.
- Megatron-lm: Training Multi-billion Parameter Language Models Using Model Parallelism Mohammad Shoeybi et al.
- An Effective Domain Adaptive Post-training Method For BERT In Response Selection Taesun Whang et al.
- Q8BERT: Quantized 8bit BERT Ofir Zafrir, Guy Boudoukh, Peter Izsak, Moshe Wasserblat
- Deep Learning Based Chatbot Models Richard Csaky
- Understanding The Behaviors Of BERT In Ranking Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
- Self-attentive Model For Headline Generation Daniil Gavrilov, Pavel Kalaidin, Valentin Malykh
- Improving Transformer Models By Reordering Their Sublayers Ofir Press, Noah A. Smith, Omer Levy
- CTRL: A Conditional Transformer Language Model For Controllable Generation Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, Richard Socher
- Levenshtein Transformer Jiatao Gu, Changhan Wang, Jake Zhao
- Attention Is Not Explanation Sarthak Jain, Byron C. Wallace
- Distilling Knowledge Learned In BERT For Text Generation Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu
- Blockwise Self-attention For Long Document Understanding Jiezhong Qiu et al.
- Learning To Answer By Learning To Ask: Getting The Best Of GPT-2 And BERT Worlds Tassilo Klein, Moin Nabi
- The Evolved Transformer David R. So, Chen Liang, Quoc V. Le
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Tree Transformer: Integrating Tree Structures Into Self-attention Yau-Shian Wang, Hung-Yi Lee, Yun-Nung Chen
- Attentive History Selection For Conversational Question Answering Chen Qu et al.
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-dowmunt
- Freelb: Enhanced Adversarial Training For Natural Language Understanding Chen Zhu et al.
- Modeling Recurrence For Transformer Jie Hao et al.
- Sg-net: Syntax-guided Machine Reading Comprehension Zhuosheng Zhang et al.
- Tinybert: Distilling BERT For Natural Language Understanding Xiaoqi Jiao et al.
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Do Attention Heads In BERT Track Syntactic Dependencies? Phu Mon Htut, Jason Phang, Shikha Bordia, Samuel R. Bowman
- Convert: Efficient And Accurate Conversational Representations From Transformers Matthew Henderson et al.
- Story Ending Prediction By Transferable BERT Zhongyang Li, Xiao Ding, Ting Liu
- Leveraging Pre-trained Checkpoints For Sequence Generation Tasks Sascha Rothe, Shashi Narayan, Aliaksei Severyn
- Compressive Transformers For Long-range Sequence Modelling Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
- Adding Interpretable Attention To Neural Translation Models Improves Word Alignment Thomas Zenkel, Joern Wuebker, John Denero
- Fusion Of Detected Objects In Text For Visual Question Answering Chris Alberti, Jeffrey Ling, Michael Collins, David Reitter
- Lakhnes: Improving Multi-instrumental Music Generation With Cross-domain Pre-training Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian McAuley
- Do Neural Dialog Systems Use The Conversation History Effectively? An Empirical Study Chinnadhurai Sankar, Sandeep Subramanian, Christopher Pal, Sarath Chandar, Yoshua Bengio
- Modeling Graph Structure In Transformer For Better Amr-to-text Generation Jie Zhu et al.
- Span Selection Pre-training For Question Answering Michael Glass et al.
- Bridging The Gap For Tokenizer-free Language Models Dokook Choe, Rami Al-rfou, Mandy Guo, Heeyoung Lee, Noah Constant
- Linguistic Knowledge And Transferability Of Contextual Representations Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
- Learning To Few-shot Learn Across Diverse Natural Language Classification Tasks Trapit Bansal, Rishikesh Jha, Andrew McCallum
- Align, Mask And Select: A Simple Method For Incorporating Commonsense Knowledge Into Language Representation Models Zhi-Xiu Ye, Qian Chen, Wen Wang, Zhen-Hua Ling
- Visualizing Attention In Transformer-based Language Representation Models Jesse Vig
- The Bottom-up Evolution Of Representations In The Transformer: A Study With Machine Translation And Language Modeling Objectives Elena Voita, Rico Sennrich, Ivan Titov
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Encoder-agnostic Adaptation For Conditional Language Generation Zachary M. Ziegler, Luke Melas-kyriazi, Sebastian Gehrmann, Alexander M. Rush
- A Multiscale Visualization Of Attention In The Transformer Model Jesse Vig
- Text Summarization With Pretrained Encoders Yang Liu, Mirella Lapata
- Adaptive Attention Span In Transformers Sainbayar Sukhbaatar, Edouard Grave, Piotr Bojanowski, Armand Joulin
- Insertion-based Decoding With Automatically Inferred Generation Order Jiatao Gu, Qi Liu, Kyunghyun Cho
- Scheduled Sampling For Transformers Tsvetomila Mihaylova, André F. T. Martins
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Parameter-efficient Transfer Learning For NLP Neil Houlsby et al.
- Exploring The Limits Of Transfer Learning With A Unified Text-to-text Transformer Colin Raffel et al.
- Transformer-xl: Attentive Language Models Beyond A Fixed-length Context Zihang Dai et al.
- Text Infilling Wanrong Zhu, Zhiting Hu, Eric Xing
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Open-retrieval Conversational Question Answering Chen Qu et al.
- Modifying Memories In Transformer Models Chen Zhu et al.
- MART: Memory-augmented Recurrent Transformer For Coherent Video Paragraph Captioning Jie Lei et al.
- VD-BERT: A Unified Vision And Dialog Transformer With BERT Yue Wang et al.
- Deformer: Decomposing Pre-trained Transformers For Faster Question Answering Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian
- As Good As New. How To Successfully Recycle English GPT-2 To Make Models For Other Languages Wietse de Vries, Malvina Nissim
- Delight: Deep And Light-weight Transformer Sachin Mehta, Marjan Ghazvininejad, Srinivasan Iyer, Luke Zettlemoyer, Hannaneh Hajishirzi
- DIET: Lightweight Language Understanding For Dialogue Systems Tanja Bunk, Daksh Varshneya, Vladimir Vlasov, Alan Nichol
- Pre-trained Summarization Distillation Sam Shleifer, Alexander M. Rush
- Logical Natural Language Generation From Open-domain Tables Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang
- Sequential Latent Knowledge Selection For Knowledge-grounded Dialogue Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
- Long Range Arena: A Benchmark For Efficient Transformers Yi Tay et al.
- Synthesizer: Rethinking Self-attention In Transformer Models Yi Tay et al.
- EDITOR: An Edit-based Transformer With Repositioning For Neural Machine Translation With Soft Lexical Constraints Weijia Xu, Marine Carpuat
- A Recurrent Vision-and-language BERT For Navigation Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould
- Data Augmentation Using Pre-trained Transformer Models Varun Kumar, Ashutosh Choudhary, Eunah Cho
- Pretrained Transformers For Simple Question Answering Over Knowledge Graphs D. Lukovnikov, A. Fischer, J. Lehmann
- KVL-BERT: Knowledge Enhanced Visual-and-linguistic BERT For Visual Commonsense Reasoning Dandan Song, Siyi Ma, Zhanchen Sun, Sicheng Yang, Lejian Liao
- Pretrained Transformers Improve Out-of-distribution Robustness Dan Hendrycks et al.
- Unqovering Stereotyping Biases Via Underspecified Questions Tao Li, Tushar Khot, Daniel Khashabi, Ashish Sabharwal, Vivek Srikumar
- When BERT Plays The Lottery, All Tickets Are Winning Sai Prasanna, Anna Rogers, Anna Rumshisky
- The Chess Transformer: Mastering Play Using Generative Language Models David Noever, Matt Ciolino, Josh Kalin
- Rikinet: Reading Wikipedia Pages For Natural Question Answering Dayiheng Liu et al.
- Coda: Contrast-enhanced And Diversity-promoting Data Augmentation For Natural Language Understanding Yanru Qu et al.
- KRISP: Integrating Implicit And Symbolic Knowledge For Open-domain Knowledge-based VQA Kenneth Marino, Xinlei Chen, Devi Parikh, Abhinav Gupta, Marcus Rohrbach
- Conversational Question Reformulation Via Sequence-to-sequence Architectures And Pretrained Language Models Sheng-Chieh Lin et al.
- Knowledge-aware Language Model Pretraining Corby Rosset et al.
- Big Bird: Transformers For Longer Sequences Manzil Zaheer et al.
- Pymt5: Multi-mode Translation Of Natural Language And Python Code With Transformers Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan
- How Effective Is Task-agnostic Data Augmentation For Pretrained Transformers? Shayne Longpre, Yu Wang, Christopher Dubois
- Russiansuperglue: A Russian Language Understanding Evaluation Benchmark Tatiana Shavrina et al.
- SEAL: Segment-wise Extractive-abstractive Long-form Text Summarization Yao Zhao, Mohammad Saleh, Peter J. Liu
- Measuring Systematic Generalization In Neural Proof Generation With Transformers Nicolas Gontier, Koustuv Sinha, Siva Reddy, Christopher Pal
- Variational Transformers For Diverse Response Generation Zhaojiang Lin, Genta Indra Winata, Peng Xu, Zihan Liu, Pascale Fung
- Robust Conversational AI With Grounded Text Generation Jianfeng Gao et al.
- Earlybert: Efficient BERT Training Via Early-bird Lottery Tickets Xiaohan Chen et al.
- Pre-training Text-to-text Transformers For Concept-centric Common Sense Wangchunshu Zhou et al.
- Non-autoregressive Machine Translation With Latent Alignments Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi
- Hard-coded Gaussian Attention For Neural Machine Translation Weiqiu You, Simeng Sun, Mohit Iyyer
- Gshard: Scaling Giant Models With Conditional Computation And Automatic Sharding Dmitry Lepikhin et al.
- A Simple But Tough-to-beat Data Augmentation Approach For Natural Language Understanding And Generation Dinghan Shen, Mingzhi Zheng, Yelong Shen, Yanru Qu, Weizhu Chen
- Deebert: Dynamic Early Exiting For Accelerating BERT Inference Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, Jimmy Lin
- Dialoguetrm: Exploring The Intra- And Inter-modal Emotional Behaviors In The Conversation Yuzhao Mao et al.
- Accelerating Training Of Transformer-based Language Models With Progressive Layer Dropping Minjia Zhang, Yuxiong He
- Mapping Natural Language Instructions To Mobile UI Action Sequences Yang Li, Jiacong He, Xin Zhou, Yuan Zhang, Jason Baldridge
- TernaryBERT: Distillation-aware Ultra-low Bit BERT Wei Zhang et al.
- Train Large, Then Compress: Rethinking Model Size For Efficient Training And Inference Of Transformers Zhuohan Li et al.
- AraGPT2: Pre-trained Transformer For Arabic Language Generation Wissam Antoun, Fady Baly, Hazem Hajj
- LayoutLMv2: Multi-modal Pre-training For Visually-rich Document Understanding Yang Xu et al.
- Sequence-level Mixed Sample Data Augmentation Demi Guo, Yoon Kim, Alexander M. Rush
- SPARTA: Efficient Open-domain Question Answering Via Sparse Transformer Matching Retrieval Tiancheng Zhao, Xiaopeng Lu, Kyusong Lee
- Improving Natural Language Processing Tasks With Human Gaze-guided Neural Attention Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
- On Optimal Transformer Depth For Low-resource Language Translation Elan Van Biljon, Arnu Pretorius, Julia Kreutzer
- The Cascade Transformer: An Application For Efficient Answer Sentence Selection Luca Soldaini, Alessandro Moschitti
- Efficient Transformer-based Large Scale Language Representations Using Hardware-friendly Block Structured Pruning Bingbing Li et al.
- TIME: Text And Image Mutual-translation Adversarial Networks Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard De Melo, Ahmed Elgammal
- BERT-hLSTMs: BERT And Hierarchical LSTMs For Visual Storytelling Jing Su, Qingyun Dai, Frank Guerin, Mian Zhou
- PALM: Pre-training An Autoencoding&Autoregressive Language Model For Context-conditioned Generation Bin Bi et al.
- VisBERT: Hidden-state Visualizations For Transformers Betty Van Aken, Benjamin Winter, Alexander Löser, Felix A. Gers
- Training Large Neural Networks With Constant Memory Using A New Execution Algorithm Bharadwaj Pudipeddi, Maral Mesmakhosroshahi, Jinwen Xi, Sujeeth Bharadwaj
- Rethinking The Value Of Transformer Components Wenxuan Wang, Zhaopeng Tu
- When Do You Need Billions Of Words Of Pretraining Data? Yian Zhang, Alex Warstadt, Haau-sing Li, Samuel R. Bowman
- SOLOIST: Building Task Bots At Scale With Transfer Learning And Machine Teaching Baolin Peng et al.
- Funnel-Transformer: Filtering Out Sequential Redundancy For Efficient Language Processing Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le
- Query Resolution For Conversational Search With Limited Supervision Nikos Voskarides, Dan Li, Pengjie Ren, Evangelos Kanoulas, Maarten De Rijke
- Encoding Syntactic Knowledge In Transformer Encoder For Intent Detection And Slot Filling Jixuan Wang, Kai Wei, Martin Radfar, Weiwei Zhang, Clement Chung
- ProphetNet: Predicting Future N-gram For Sequence-to-sequence Pre-training Weizhen Qi et al.
- CoLAKE: Contextualized Language And Knowledge Embedding Tianxiang Sun et al.
- Improving Vision-and-language Navigation With Image-text Pairs From The Web Arjun Majumdar et al.
- Behind The Scene: Revealing The Secrets Of Pre-trained Vision-and-language Models Jize Cao et al.
- Just Ask: Learning To Answer Questions From Millions Of Narrated Videos Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid
- Compressing Large-scale Transformer-based Models: A Case Study On BERT Prakhar Ganesh et al.
- From Zero To Hero: On The Limitations Of Zero-shot Cross-lingual Transfer With Multilingual Transformers Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
- GMAT: Global Memory Augmentation For Transformers Ankit Gupta, Jonathan Berant
- SPECTER: Document-level Representation Learning Using Citation-informed Transformers Arman Cohan, Sergey Feldman, Iz Beltagy, Doug Downey, Daniel S. Weld
- Proofwriter: Generating Implications, Proofs, And Abductive Statements Over Natural Language Oyvind Tafjord, Bhavana Dalvi Mishra, Peter Clark
- EdgeBERT: Sentence-level Energy Optimizations For Latency-aware Multi-task NLP Inference Thierry Tambe et al.
- Code Prediction By Feeding Trees To Transformers Seohyun Kim, Jinman Zhao, Yuchi Tian, Satish Chandra
- Addressing Some Limitations Of Transformers With Feedback Memory Angela Fan, Thibaut Lavril, Edouard Grave, Armand Joulin, Sainbayar Sukhbaatar
- An Empirical Investigation Of Pre-trained Transformer Language Models For Open-domain Dialogue Generation Piji Li
- AdapterHub: A Framework For Adapting Transformers Jonas Pfeiffer et al.
- Unnatural Language Inference Koustuv Sinha, Prasanna Parthasarathi, Joelle Pineau, Adina Williams
- AdapterDrop: On The Efficiency Of Adapters In Transformers Andreas Rücklé et al.
- Natural Language Rationales With Full-stack Visual Reasoning: From Pixels To Semantic Frames To Commonsense Graphs Ana Marasović et al.
- ETC: Encoding Long And Structured Inputs In Transformers Joshua Ainslie et al.
- On The Stability Of Fine-tuning BERT: Misconceptions, Explanations, And Strong Baselines Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow
- Byte Pair Encoding Is Suboptimal For Language Model Pretraining Kaj Bostrom, Greg Durrett
- Chatbot Interaction With Artificial Intelligence: Human Data Augmentation With T5 And Language Transformer Ensemble For Text Classification Jordan J. Bird, Anikó Ekárt, Diego R. Faria
- Leap-of-thought: Teaching Pre-trained Models To Systematically Reason Over Implicit Knowledge Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant
- CoCon: A Self-supervised Approach For Controlled Text Generation Alvin Chan, Yew-soon Ong, Bill Pung, Aston Zhang, Jie Fu
- Automated Source Code Generation And Auto-completion Using Deep Learning: Comparing And Discussing Current Language-model-related Approaches Juan Cruz-benito, Sanjay Vishwakarma, Francisco Martin-fernandez, Ismael Faro
- POINTER: Constrained Progressive Text Generation Via Insertion-based Generative Pre-training Yizhe Zhang et al.
- IntelliCode Compose: Code Generation Using Transformer Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan
- Transformers As Soft Reasoners Over Language Peter Clark, Oyvind Tafjord, Kyle Richardson
- Look Before You Speak: Visually Contextualized Utterances Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid
- DialogBERT: Discourse-aware Response Generation Via Learning To Recover And Rank Utterances Xiaodong Gu, Kang Min Yoo, Jung-woo Ha
- Non-autoregressive Machine Translation With Disentangled Context Transformer Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu
- Contrastive Learning With Adversarial Perturbations For Conditional Text Generation Seanie Lee, Dong Bok Lee, Sung Ju Hwang
- Auto-captions On GIF: A Large-scale Video-sentence Dataset For Vision-language Pre-training Yingwei Pan et al.
- Rapidly Bootstrapping A Question Answering Dataset For COVID-19 Raphael Tang et al.
- MiniLM: Deep Self-attention Distillation For Task-agnostic Compression Of Pre-trained Transformers Wenhui Wang et al.
- mT5: A Massively Multilingual Pre-trained Text-to-text Transformer Linting Xue et al.
- MiniLMv2: Multi-head Self-attention Relation Distillation For Compressing Pretrained Transformers Wenhui Wang, Hangbo Bao, Shaohan Huang, Li Dong, Furu Wei
- Longformer: The Long-document Transformer Iz Beltagy, Matthew E. Peters, Arman Cohan
- IART: Intent-aware Response Ranking With Transformers In Information-seeking Conversation Systems Liu Yang et al.
- CodeBERT: A Pre-trained Model For Programming And Natural Languages Zhangyin Feng et al.
- LightSeq: A High Performance Inference Library For Transformers Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li
- DSTC8-AVSD: Multimodal Semantic Transformer Network With Retrieval Style Word Generator Hwanhee Lee et al.
- Rethinking Embedding Coupling In Pre-trained Language Models Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder
- Calibration Of Pre-trained Transformers Shrey Desai, Greg Durrett
- Probing Pretrained Language Models For Lexical Semantics Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- How Fine Can Fine-tuning Be? Learning Efficient Language Models Evani Radiya-dixit, Xin Wang
- Document Ranking With A Pretrained Sequence-to-sequence Model Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- Simplifying Paragraph-level Question Generation Via Transformer Language Models Luis Enrico Lopez, Diane Kathryn Cruz, Jan Christian Blaise Cruz, Charibeth Cheng
- EmpTransfo: A Multi-head Transformer Architecture For Creating Empathetic Dialog Systems Rohola Zandie, Mohammad H. Mahoor
- UniLMv2: Pseudo-masked Language Models For Unified Language Model Pre-training Hangbo Bao et al.
- Mixup-transformer: Dynamic Data Augmentation For NLP Tasks Lichao Sun et al.
- Assessing Phrasal Representation And Composition In Transformers Lang Yu, Allyson Ettinger
- Indic-transformers: An Analysis Of Transformer Language Models For Indian Languages Kushal Jain, Adwait Deshpande, Kumar Shridhar, Felix Laumann, Ayushman Dash
- End-to-end Synthetic Data Generation For Domain Adaptation Of Question Answering Systems Siamak Shakeri et al.
- Length-adaptive Transformer: Train Once With Length Drop, Use Anytime With Search Gyuwan Kim, Kyunghyun Cho
- What Does BERT Know About Books, Movies And Music? Probing BERT For Conversational Recommendation Gustavo Penha, Claudia Hauff
- Linformer: Self-attention With Linear Complexity Sinong Wang, Belinda Z. Li, Madian Khabsa, Han Fang, Hao Ma
- Rethinking Positional Encoding In Language Pre-training Guolin Ke, Di He, Tie-yan Liu
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- Text-to-text Pre-training For Data-to-text Tasks Mihir Kale, Abhinav Rastogi
- ERNIE-Doc: A Retrospective Long-document Modeling Transformer Siyu Ding et al.
- Low-rank Bottleneck In Multi-head Attention Models Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar
- X-LXMERT: Paint, Caption And Answer Questions With Multi-modal Transformers Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi
- Retrofitting Structure-aware Transformer Language Model For End Tasks Hao Fei, Yafeng Ren, Donghong Ji
- TRANS-BLSTM: Transformer With Bidirectional LSTM For Language Understanding Zhiheng Huang, Peng Xu, Davis Liang, Ajay Mishra, Bing Xiang
- GPT-too: A Language-model-first Approach For AMR-to-text Generation Manuel Mager et al.
- Coregen: Contextualized Code Representation Learning For Commit Message Generation Lun Yiu Nie et al.
- On The Effect Of Dropping Layers Of Pre-trained Transformer Models Hassan Sajjad, Fahim Dalvi, Nadir Durrani, Preslav Nakov
- TurnGPT: A Transformer-based Language Model For Predicting Turn-taking In Spoken Dialog Erik Ekstedt, Gabriel Skantze
- A Controllable Model Of Grounded Response Generation Zeqiu Wu et al.
- A Closer Look At The Robustness Of Vision-and-language Pre-trained Models Linjie Li, Zhe Gan, Jingjing Liu
- MobileBERT: A Compact Task-agnostic BERT For Resource-limited Devices Zhiqing Sun et al.
- Mention Memory: Incorporating Textual Knowledge Into Transformers Through Entity Mention Attention Michiel De Jong, Yury Zemlyanskiy, Nicholas Fitzgerald, Fei Sha, William Cohen
- LightningDOT: Pre-training Visual-semantic Embeddings For Real-time Image-text Retrieval Siqi Sun et al.
- LightNER: A Lightweight Tuning Paradigm For Low-resource NER Via Pluggable Prompting Xiang Chen et al.
- One Chatbot Per Person: Creating Personalized Chatbots Based On Implicit User Profiles Zhengyi Ma, Zhicheng Dou, Yutao Zhu, Hanxun Zhong, Ji-rong Wen
- The NLP Cookbook: Modern Recipes For Transformer Based Deep Learning Architectures Sushant Singh, Ausif Mahmood
- ERNIE-ViLG: Unified Generative Pre-training For Bidirectional Vision-language Generation Han Zhang et al.
- VLMo: Unified Vision-language Pre-training With Mixture-of-modality-experts Hangbo Bao et al.
- E2E-VLP: End-to-end Vision-language Pre-training Enhanced By Visual Learning Haiyang Xu et al.
- Vision-and-language Or Vision-for-language? On Cross-modal Influence In Multimodal Transformers Stella Frank, Emanuele Bugliarello, Desmond Elliott
- Evaluating The Robustness Of Retrieval Pipelines With Query Variation Generators Gustavo Penha, Arthur Câmara, Claudia Hauff
- Improved Text Classification Via Contrastive Adversarial Training Lin Pan, Chung-wei Hang, Avirup Sil, Saloni Potdar
- LongT5: Efficient Text-to-text Transformer For Long Sequences Mandy Guo et al.
- G-transformer For Document-level Machine Translation Guangsheng Bao, Yue Zhang, Zhiyang Teng, Boxing Chen, Weihua Luo
- Scale Efficiently: Insights From Pre-training And Fine-tuning Transformers Yi Tay et al.
- ByT5: Towards A Token-free Future With Pre-trained Byte-to-byte Models Linting Xue et al.
- Causal Attention For Vision-language Tasks Xu Yang, Hanwang Zhang, Guojun Qi, Jianfei Cai
- RobeCzech: Czech RoBERTa, A Monolingual Contextualized Language Representation Model Milan Straka, Jakub Náplava, Jana Straková, David Samuel
- Improving Stack Overflow Question Title Generation With Copying Enhanced CodeBERT Model And Bi-modal Information Fengji Zhang et al.
- Progressive Transformer-based Generation Of Radiology Reports Farhad Nooralahzadeh, Nicolas Perez Gonzalez, Thomas Frauenfelder, Koji Fujimoto, Michael Krauthammer
- GPT-3 Models Are Poor Few-shot Learners In The Biomedical Domain Milad Moradi, Kathrin Blagec, Florian Haberl, Matthias Samwald
- SciFive: A Text-to-text Transformer Model For Biomedical Literature Long N. Phan et al.
- Text-free Prosody-aware Generative Spoken Language Modeling Eugene Kharitonov et al.
- MT6: Multilingual Pretrained Text-to-text Transformer With Translation Pairs Zewen Chi et al.
- Exploring Transformers In Natural Language Generation: GPT, BERT, And XLNet M. Onat Topal, Anil Bas, Imke Van Heerden
- Taming Sparsely Activated Transformer With Stochastic Experts Simiao Zuo et al.
- Personalized Transformer For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Code Structure Guided Transformer For Source Code Summarization Shuzheng Gao et al.
- Condenser: A Pre-training Architecture For Dense Retrieval Luyu Gao, Jamie Callan
- Advancing High-resolution Video-language Representation With Large-scale Video Transcriptions Hongwei Xue et al.
- FILIP: Fine-grained Interactive Language-image Pre-training Lewei Yao et al.
- KAT: A Knowledge Augmented Transformer For Vision-and-language Liangke Gui et al.
- Generic Attention-model Explainability For Interpreting Bi-modal And Encoder-decoder Transformers Hila Chefer, Shir Gur, Lior Wolf
- CoTexT: Multi-task Learning With Code-text Transformer Long Phan et al.
- Transformer-based Conditional Variational Autoencoder For Controllable Story Generation Le Fang et al.
- WangchanBERTa: Pretraining Transformer-based Thai Language Models Lalita Lowphansirikul, Charin Polpanumas, Nawat Jantrakulchai, Sarana Nutanong
- An Explanation Of In-context Learning As Implicit Bayesian Inference Sang Michael Xie, Aditi Raghunathan, Percy Liang, Tengyu Ma
- Scaling Language Models: Methods, Analysis & Insights From Training Gopher Jack W. Rae et al.
- Using Prior Knowledge To Guide BERT's Attention In Semantic Textual Matching Tasks Tingyu Xia, Yue Wang, Yuan Tian, Yi Chang
- SwinBERT: End-to-end Transformers With Sparse Attention For Video Captioning Kevin Lin et al.
- Pretrained Transformers As Universal Computation Engines Kevin Lu, Aditya Grover, Pieter Abbeel, Igor Mordatch
- Conversational Question Answering Over Knowledge Graphs With Transformer And Graph Attention Networks Endri Kacupaj et al.
- Investigating The Limitations Of Transformers With Simple Arithmetic Tasks Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin
- BitFit: Simple Parameter-efficient Fine-tuning For Transformer-based Masked Language-models Elad Ben Zaken, Shauli Ravfogel, Yoav Goldberg
- AraT5: Text-to-text Transformers For Arabic Language Generation El Moatez Billah Nagoudi, Abdelrahim Elmadany, Muhammad Abdul-mageed
- LoRA: Low-rank Adaptation Of Large Language Models Edward J. Hu et al.
- Luna: Linear Unified Nested Attention Xuezhe Ma et al.
- Sequence Length Is A Domain: Length-based Overfitting In Transformer Models Dušan Variš, Ondřej Bojar
- Align And Prompt: Video-and-language Pre-training With Entity Prompts Dongxu Li, Junnan Li, Hongdong Li, Juan Carlos Niebles, Steven C. H. Hoi
- Cross-attention Is All You Need: Adapting Pretrained Transformers For Machine Translation Mozhdeh Gheini, Xiang Ren, Jonathan May
- TR-BERT: Dynamic Token Reduction For Accelerating BERT Inference Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun
- Text Compression-aided Transformer Encoding Zuchao Li et al.
- How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty In Text Generation Using RAVEN R. Thomas Mccoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
- Diagnosing Vision-and-language Navigation: What Really Matters Wanrong Zhu et al.
- Greedy-layer Pruning: Speeding Up Transformer Models For Natural Language Processing David Peer, Sebastian Stabinger, Stefan Engl, Antonio Rodriguez-sanchez
- Primer: Searching For Efficient Transformers For Language Modeling David R. So et al.
- Larger-scale Transformers For Multilingual Masked Language Modeling Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau
- Adaptive Semiparametric Language Models Dani Yogatama, Cyprien De Masson D'autume, Lingpeng Kong
- Knowledge Neurons In Pretrained Transformers Damai Dai et al.
- DYLE: Dynamic Latent Extraction For Abstractive Long-input Summarization Ziming Mao et al.
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- Unifying Multimodal Transformer For Bi-directional Image And Text Generation Yupan Huang, Hongwei Xue, Bei Liu, Yutong Lu
- Fastformer: Additive Attention Can Be All You Need Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang, Xing Xie
- Multimodal Transformer With Variable-length Memory For Vision-and-language Navigation Chuang Lin et al.
- Generate, Annotate, And Learn: NLP With Synthetic Text Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi
- ClimateBERT: A Pretrained Language Model For Climate-related Text Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold
- MMBERT: Multimodal BERT Pretraining For Improved Medical VQA Yash Khare et al.
- Prompting Visual-language Models For Efficient Video Understanding Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie
- CodeT5: Identifier-aware Unified Pre-trained Encoder-decoder Models For Code Understanding And Generation Yue Wang, Weishi Wang, Shafiq Joty, Steven C. H. Hoi
- TeraPipe: Token-level Pipeline Parallelism For Training Large-scale Language Models Zhuohan Li et al.
- Understanding And Overcoming The Challenges Of Efficient Transformer Quantization Yelysei Bondarenko, Markus Nagel, Tijmen Blankevoort
- See, Hear, Read: Leveraging Multimodality With Guided Attention For Abstractive Text Summarization Yash Kumar Atri, Shraman Pramanick, Vikram Goyal, Tanmoy Chakraborty
- SGEITL: Scene Graph Enhanced Image-text Learning For Visual Commonsense Reasoning Zhecan Wang et al.
- Recent Advances In Natural Language Processing Via Large Pre-trained Language Models: A Survey Bonan Min et al.
- What Changes Can Large-scale Language Models Bring? Intensive Study On HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers Boseop Kim et al.
- Human Parity On Commonsenseqa: Augmenting Self-attention With External Attention Yichong Xu et al.
- Hierarchical Task Learning From Language Instructions With Unified Transformers And Self-monitoring Yichi Zhang, Joyce Chai
- Prune Once For All: Sparse Pre-trained Language Models Ofir Zafrir, Ariel Larey, Guy Boudoukh, Haihao Shen, Moshe Wasserblat
- URLTran: Improving Phishing URL Detection Using Transformers Pranav Maneriker et al.
- CDLM: Cross-document Language Modeling Avi Caciularu et al.
- Towards Facilitating Empathic Conversations In Online Mental Health Support: A Reinforcement Learning Approach Ashish Sharma, Inna W. Lin, Adam S. Miner, David C. Atkins, Tim Althoff
- What Do Pre-trained Code Models Know About Code? Anjan Karmakar, Romain Robbes
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- Are Pre-trained Convolutions Better Than Pre-trained Transformers? Yi Tay et al.
- Wordcraft: A Human-ai Collaborative Editor For Story Writing Andy Coenen, Luke Davis, Daphne Ippolito, Emily Reif, Ann Yuan
- KM-BART: Knowledge Enhanced Multimodal BART For Visual Commonsense Generation Yiran Xing et al.
- Long-span Summarization Via Local Attention And Content Selection Potsawee Manakul, Mark J. F. Gales
- Episodic Transformer For Vision-and-language Navigation Alexander Pashevich, Cordelia Schmid, Chen Sun
- Embodied BERT: A Transformer Model For Embodied, Language-guided Visual Task Completion Alessandro Suglia, Qiaozi Gao, Jesse Thomason, Govind Thattai, Gaurav Sukhatme
- An Exploratory Study On Long Dialogue Summarization: What Works And What's Next Yusen Zhang et al.
- Large Pre-trained Language Models Contain Human-like Biases Of What Is Right And Wrong To Do Patrick Schramowski, Cigdem Turan, Nico Andersen, Constantin A. Rothkopf, Kristian Kersting
- Distilling Large Language Models Into Tiny And Effective Students Using pQRNN Prabhu Kaliamoorthi, Aditya Siddhant, Edward Li, Melvin Johnson
- Quiz-style Question Generation For News Stories Adam D. Lelkes, Vinh Q. Tran, Cong Yu
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- An Empirical Study Of Training End-to-end Vision-and-language Transformers Zi-yi Dou et al.
- RoBERTuito: A Pre-trained Language Model For Social Media Text In Spanish Juan Manuel Pérez, Damián A. Furman, Laura Alonso Alemany, Franco Luque
- Learned Token Pruning For Transformers Sehoon Kim et al.
- I-BERT: Integer-only BERT Quantization Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
- VisualGPT: Data-efficient Adaptation Of Pretrained Language Models For Image Captioning Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
- MATE: Multi-view Attention For Table Transformer Efficiency Julian Martin Eisenschlos, Maharshi Gor, Thomas Müller, William W. Cohen
- Improving Language Models By Retrieving From Trillions Of Tokens Sebastian Borgeaud et al.
- Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots Juntao Li et al.
- Rome Was Built In 1776: A Case Study On Factual Correctness In Knowledge-grounded Response Generation Sashank Santhanam et al.
- CANINE: Pre-training An Efficient Tokenization-free Encoder For Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting
- Planning With Learned Entity Prompts For Abstractive Summarization Shashi Narayan et al.
- Learning Rich Representation Of Keyphrases From Text Mayank Kulkarni, Debanjan Mahata, Ravneet Arora, Rajarshi Bhowmik
- Show Your Work: Scratchpads For Intermediate Computation With Language Models Maxwell Nye et al.
- Sentence-t5: Scalable Sentence Encoders From Pre-trained Text-to-text Models Jianmo Ni et al.
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- UFO: A Unified Transformer For Vision-language Representation Learning Jianfeng Wang et al.
- FLAT: An Optimized Dataflow For Mitigating Attention Bottlenecks Sheng-chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
- FastMoE: A Fast Mixture-of-expert Training System Jiaao He et al.
- HiddenCut: Simple Data Augmentation For Natural Language Understanding With Better Generalization Jiaao Chen, Dinghan Shen, Weizhu Chen, Diyi Yang
- When Attention Meets Fast Recurrence: Training Language Models With Reduced Compute Tao Lei
- Trankit: A Light-weight Transformer-based Toolkit For Multilingual Natural Language Processing Minh Van Nguyen, Viet Dac Lai, Amir Pouran Ben Veyseh, Thien Huu Nguyen
- Explaining Documents' Relevance To Search Queries Razieh Rahimi, Youngwoo Kim, Hamed Zamani, James Allan
- AMMUS: A Survey Of Transformer-based Pretrained Models In Natural Language Processing Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- A Comparative Study Of Transformer-based Language Models On Extractive Question Answering Kate Pearce, Tiffany Zhan, Aneesh Komanduri, Justin Zhan
- Training Verifiers To Solve Math Word Problems Karl Cobbe et al.
- Augmenting Sequential Recommendation With Pseudo-prior Items Via Reversely Pre-training Transformer Zhiwei Liu, Ziwei Fan, Yu Wang, Philip S. Yu
- A Good Prompt Is Worth Millions Of Parameters: Low-resource Prompt-based Learning For Vision-language Models Woojeong Jin, Yu Cheng, Yelong Shen, Weizhu Chen, Xiang Ren
- NormFormer: Improved Transformer Pretraining With Extra Normalization Sam Shleifer, Jason Weston, Myle Ott
- ReACC: A Retrieval-augmented Code Completion Framework Shuai Lu et al.
- Training And Evaluating A Jupyter Notebook Data Science Assistant Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan
- DALL-Eval: Probing The Reasoning Skills And Social Biases Of Text-to-image Generation Models Jaemin Cho, Abhay Zala, Mohit Bansal
- RetroMAE: Pre-training Retrieval-oriented Language Models Via Masked Auto-encoder Shitao Xiao, Zheng Liu, Yingxia Shao, Zhao Cao
- OPT: Open Pre-trained Transformer Language Models Susan Zhang et al.
- Contrastive Learning With Bidirectional Transformers For Sequential Recommendation Hanwen Du et al.
- Uni-perceiver V2: A Generalist Model For Large-scale Vision And Vision-language Tasks Hao Li et al.
- VL-BEiT: Generative Vision-language Pretraining Hangbo Bao, Wenhui Wang, Li Dong, Furu Wei
- A Survey Of Controllable Text Generation Using Transformer-based Pre-trained Language Models Hanqing Zhang, Haolin Song, Shaoyu Li, Ming Zhou, Dawei Song
- Data Distributional Properties Drive Emergent In-context Learning In Transformers Stephanie C. Y. Chan et al.
- Vision-and-language Pretrained Models: A Survey Siqu Long, Feiqi Cao, Soyeon Caren Han, Haiqin Yang
- LUT-GEMM: Quantized Matrix Multiplication Based On Luts For Efficient Inference In Large-scale Generative Language Models Gunho Park et al.
- HiTSKT: A Hierarchical Transformer Model For Session-aware Knowledge Tracing Fucai Ke et al.
- Ignore Previous Prompt: Attack Techniques For Language Models Fábio Perez, Ian Ribeiro
- VL-InterpreT: An Interactive Visualization Tool For Interpreting Vision-language Transformers Estelle Aflalo et al.
- FlashAttention: Fast And Memory-efficient Exact Attention With IO-awareness Tri Dao, Daniel Y. Fu, Stefano Ermon, Atri Rudra, Christopher Ré
- Personalized Prompt Learning For Explainable Recommendation Lei Li, Yongfeng Zhang, Li Chen
- Transformer Quality In Linear Time Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le
- Mass-editing Memory In A Transformer Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau
- Minicons: Enabling Flexible Behavioral And Representational Analyses Of Transformer Language Models Kanishka Misra
- Training Compute-optimal Large Language Models Jordan Hoffmann et al.
- Cramming: Training A Language Model On A Single GPU In One Day Jonas Geiping, Tom Goldstein
- Towards Trustworthy Autograding Of Short, Multi-lingual, Multi-type Answers Johannes Schneider, Robin Richner, Micha Riser
- RASAT: Integrating Relational Structures Into Pretrained Seq2seq Model For Text-to-sql Jiexing Qi et al.
- Improving The Domain Adaptation Of Retrieval Augmented Generation (RAG) Models For Open Domain Question Answering Shamane Siriwardhana et al.
- Unified-io: A Unified Model For Vision, Language, And Multi-modal Tasks Jiasen Lu, Christopher Clark, Rowan Zellers, Roozbeh Mottaghi, Aniruddha Kembhavi
- GIT: A Generative Image-to-text Transformer For Vision And Language Jianfeng Wang et al.
- Gtrans: Grouping And Fusing Transformer Layers For Neural Machine Translation Jian Yang et al.
- Using DeepSpeed And Megatron To Train Megatron-Turing NLG 530B, A Large-scale Generative Language Model Shaden Smith et al.
- CogVideo: Large-scale Pretraining For Text-to-video Generation Via Transformers Wenyi Hong, Ming Ding, Wendi Zheng, Xinghan Liu, Jie Tang
- Accelerating Attention Through Gradient-based Learned Runtime Pruning Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
- CoCa: Contrastive Captioners Are Image-text Foundation Models Jiahui Yu et al.
- Hybrid Transformer With Multi-level Fusion For Multimodal Knowledge Graph Completion Xiang Chen et al.
- PaLI: A Jointly-scaled Multilingual Language-image Model Xi Chen et al.
- Confident Adaptive Language Modeling Tal Schuster et al.
- Scaling Autoregressive Models For Content-rich Text-to-image Generation Jiahui Yu et al.
- BERTIN: Efficient Pre-training Of A Spanish Language Model Using Perplexity Sampling Javier De La Rosa et al.
- CaMEL: Mean Teacher Learning For Image Captioning Manuele Barraco et al.
- Efficient Long-text Understanding With Short-text Models Maor Ivgi, Uri Shaham, Jonathan Berant
- ViT5: Pretrained Text-to-text Transformer For Vietnamese Language Generation Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- InPars: Data Augmentation For Information Retrieval Using Large Language Models Luiz Bonifacio, Hugo Abonizio, Marzieh Fadaee, Rodrigo Nogueira
- LaMDA: Language Models For Dialog Applications Romal Thoppilan et al.
- What Language Model Architecture And Pretraining Objective Work Best For Zero-shot Generalization? Thomas Wang et al.
- Phenaki: Variable Length Video Generation From Open Domain Textual Description Ruben Villegas et al.
- Retrieval-augmented Multimodal Language Modeling Michihiro Yasunaga et al.
- CLIPPO: Image-and-language Understanding From Pixels Only Michael Tschannen, Basil Mustafa, Neil Houlsby
- Re2g: Retrieve, Rerank, Generate Michael Glass et al.
- Visual Prompt Tuning Menglin Jia et al.
- ZeroQuant: Efficient And Affordable Post-training Quantization For Large-scale Transformers Zhewei Yao et al.
- BioGPT: Generative Pre-trained Transformer For Biomedical Text Generation And Mining Renqian Luo et al.
- An Empirical Study Of End-to-end Video-language Transformers With Masked Visual Modeling Tsu-jui Fu et al.
- MuRAG: Multimodal Retrieval-augmented Generator For Open Question Answering Over Images And Text Wenhu Chen, Hexiang Hu, Xi Chen, Pat Verga, William W. Cohen
- LLM.int8(): 8-bit Matrix Multiplication For Transformers At Scale Tim Dettmers, Mike Lewis, Younes Belkada, Luke Zettlemoyer
- GPTQ: Accurate Post-training Quantization For Generative Pre-trained Transformers Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh
- The Optimal BERT Surgeon: Scalable And Accurate Second-order Pruning For Large Language Models Eldar Kurtic et al.
- HyperPrompt: Prompt-based Task-conditioning Of Transformers Yun He et al.
- LayoutLMv3: Pre-training For Document AI With Unified Text And Image Masking Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei
- VIMA: General Robot Manipulation With Multimodal Prompts Yunfan Jiang et al.
- Democratizing Contrastive Language-image Pre-training: A CLIP Benchmark Of Data, Model, And Supervision Yufeng Cui, Lichen Zhao, Feng Liang, Yangguang Li, Jing Shao
- Block-recurrent Transformers Delesley Hutchins, Imanol Schlag, Yuhuai Wu, Ethan Dyer, Behnam Neyshabur
- Fast Inference From Transformers Via Speculative Decoding Yaniv Leviathan, Matan Kalman, Yossi Matias
- ByteTransformer: A High-performance Transformer Boosted For Variable-length Inputs Yujia Zhai et al.
- Hungry Hungry Hippos: Towards Language Modeling With State Space Models Daniel Y. Fu et al.
- Competition-level Code Generation With Alphacode Yujia Li et al.
- Why Can GPT Learn In-context? Language Models Implicitly Perform Gradient Descent As Meta-optimizers Damai Dai et al.
- Memorizing Transformers Yuhuai Wu, Markus N. Rabe, Delesley Hutchins, Christian Szegedy
- What Do They Capture? -- A Structural Analysis Of Pre-trained Language Models For Source Code Yao Wan et al.
- AdaMix: Mixture-of-adaptations For Parameter-efficient Model Tuning Yaqing Wang et al.
- Language Model Compression With Weighted Low-rank Factorization Yen-chang Hsu et al.
- Exploring Length Generalization In Large Language Models Cem Anil et al.
- An Efficient Memory-augmented Transformer For Knowledge-intensive NLP Tasks Yuxiang Wu et al.
- In-context Learning And Induction Heads Catherine Olsson et al.
- A Survey On Model Compression And Acceleration For Pretrained Language Models Canwen Xu, Julian McAuley
- Retrieval Augmentation Of Large Language Models For Lay Language Generation Yue Guo, Wei Qiu, Gondy Leroy, Sheng Wang, Trevor Cohen
- Why Does Surprisal From Larger Transformer-based Language Models Provide A Poorer Fit To Human Reading Times? Byung-doh Oh, William Schuler
- Expanding Language-image Pretrained Models For General Video Recognition Bolin Ni et al.
- Revisiting End-to-end Speech-to-text Translation From Scratch Biao Zhang, Barry Haddow, Rico Sennrich
- BLOOM: A 176B-parameter Open-access Multilingual Language Model BigScience Workshop et al.
- ST-MoE: Designing Stable And Transferable Sparse Expert Models Barret Zoph et al.
- A Survey Of Vision-language Pre-trained Models Yifan Du, Zikang Liu, Junyi Li, Wayne Xin Zhao
- Long-form Video-language Pre-training With Multimodal Temporal Contrastive Learning Yuchong Sun et al.
- Recurrent Memory Transformer Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
- Clinical-Longformer And Clinical-BigBird: Transformers For Long Clinical Sequences Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- T-NER: An All-round Python Library For Transformer-based Named Entity Recognition Asahi Ushio, Jose Camacho-Collados
- Reshaping Robot Trajectories Using Natural Language Commands: A Study Of Multi-modal Data Alignment Using Transformers Arthur Bucker et al.
- Super-NaturalInstructions: Generalization Via Declarative Instructions On 1600+ NLP Tasks Yizhong Wang et al.
- A Model-agnostic Data Manipulation Method For Persona-based Dialogue Generation Yu Cao, Wei Bi, Meng Fang, Shuming Shi, Dacheng Tao
- A Systematic Review And Replicability Study Of BERT4Rec For Sequential Recommendation Aleksandr Petrov, Craig Macdonald
- Empowering Language Models With Knowledge Graph Reasoning For Question Answering Ziniu Hu et al.
- A New Path: Scaling Vision-and-language Navigation With Synthetic Instructions And Imitation Learning Aishwarya Kamath et al.
- StoryDALL-E: Adapting Pretrained Text-to-image Transformers For Story Continuation Adyasha Maharana, Darryl Hannan, Mohit Bansal
- Transformer Language Models Without Positional Encodings Still Learn Positional Information Adi Haviv, Ori Ram, Ofir Press, Peter Izsak, Omer Levy
- Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models Aarohi Srivastava et al.
- TALM: Tool Augmented Language Models Aaron Parisi, Yao Zhao, Noah Fiedel
- PaLM: Scaling Language Modeling With Pathways Aakanksha Chowdhery et al.
- A Length-extrapolatable Transformer Yutao Sun et al.
- Can Machines Help Us Answering Question 16 In Datasheets, And In Turn Reflecting On Inappropriate Content? Patrick Schramowski, Christopher Tauchmann, Kristian Kersting
- Make-A-Scene: Scene-based Text-to-image Generation With Human Priors Oran Gafni et al.
- Educational Question Generation Of Children Storybooks Via Question Type Distribution Learning And Event-centric Summarization Zhenjie Zhao et al.
- Generative Spoken Dialogue Language Modeling Tu Anh Nguyen et al.
- Meta Policy Learning For Cold-start Conversational Recommendation Zhendong Chu, Hongning Wang, Yun Xiao, Bo Long, Lingfei Wu
- Language Models Are Realistic Tabular Data Generators Vadim Borisov, Kathrin Seßler, Tobias Leemann, Martin Pawelczyk, Gjergji Kasneci
- Few-shot Training LLMs For Project-specific Code-summarization Toufique Ahmed, Premkumar Devanbu
- What Matters In Language Conditioned Robotic Imitation Learning Over Unstructured Data Oier Mees, Lukas Hermann, Wolfram Burgard
- Survey Of Hallucination In Natural Language Generation Ziwei Ji et al.
- Parallel Context Windows For Large Language Models Nir Ratner et al.
- SGPT: GPT Sentence Embeddings For Semantic Search Niklas Muennighoff
- SPACE-3: Unified Dialog Model Pre-training For Task-oriented Dialog Understanding And Generation Wanwei He et al.
- Transformer Feed-forward Layers Build Predictions By Promoting Concepts In The Vocabulary Space Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg
- AraBART: A Pretrained Arabic Sequence-to-sequence Model For Abstractive Summarization Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
- Hyena Hierarchy: Towards Larger Convolutional Language Models Michael Poli et al.
- Zero- And Few-shot Prompting With LLMs: A Comparative Study With Fine-tuned Models For Bangla Sentiment Analysis Md. Arid Hasan et al.
- CTRAN: CNN-Transformer-based Network For Natural Language Understanding Mehrdad Rafiepour, Javad Salimi Sartakhti
- Text Matching Improves Sequential Recommendation By Reducing Popularity Biases Zhenghao Liu et al.
- Natural Language Generation And Understanding Of Big Code For AI-assisted Programming: A Review Man Fai Wong, Shangxin Guo, Ching Nam Hang, Siu Wai Ho, Chee Wei Tan
- Parameter-efficient Fine-tuning Methods For Pretrained Language Models: A Critical Review And Assessment Lingling Xu, Haoran Xie, Si-zhao Joe Qin, Xiaohui Tao, Fu Lee Wang
- Deep Learning Mental Health Dialogue System Lennart Brocki, George C. Dyer, Anna Gładka, Neo Christopher Chung
- SentimentGPT: Exploiting GPT For Advanced Sentiment Analysis And Its Departure From Current Machine Learning Kiana Kheiri, Hamid Karimi
- A Survey Of GPT-3 Family Large Language Models Including ChatGPT And GPT-4 Katikapalli Subramanyam Kalyan
- BLIP-2: Bootstrapping Language-image Pre-training With Frozen Image Encoders And Large Language Models Junnan Li, Dongxu Li, Silvio Savarese, Steven Hoi
- GQA: Training Generalized Multi-query Transformer Models From Multi-head Checkpoints Joshua Ainslie et al.
- MissRec: Pre-training And Transferring Multi-modal Interest-aware Sequence Representation For Recommendation Jinpeng Wang et al.
- LongNet: Scaling Transformers To 1,000,000,000 Tokens Jiayu Ding et al.
- LLM Lies: Hallucinations Are Not Bugs, But Features As Adversarial Examples Jia-yu Yao et al.
- Unlearn What You Want To Forget: Efficient Unlearning For LLMs Jiaao Chen, Diyi Yang
- Learning To Compress Prompts With Gist Tokens Jesse Mu, Xiang Lisa Li, Noah Goodman
- Graphix-T5: Mixing Pre-trained Transformers With Graph-aware Layers For Text-to-SQL Parsing Jinyang Li et al.
- Large Language Models (GPT) Struggle To Answer Multiple-choice Questions About Code Jaromir Savelka, Arav Agarwal, Christopher Bogart, Majd Sakr
- ChatGPT: Jack Of All Trades, Master Of None Jan Kocoń et al.
- Simple And Controllable Music Generation Jade Copet et al.
- Evaluation Of ChatGPT On Biomedical Tasks: A Zero-shot Comparison With Fine-tuned Generative Transformers Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng, Jimmy Huang
- Muse: Text-to-image Generation Via Masked Generative Transformers Huiwen Chang et al.
- IP-Adapter: Text Compatible Image Prompt Adapter For Text-to-image Diffusion Models Hu Ye, Jun Zhang, Sibo Liu, Xiao Han, Wei Yang
- ChatGPT For Shaping The Future Of Dentistry: The Potential Of Multi-modal Large Language Model Hanyao Huang et al.
- Explainability For Large Language Models: A Survey Haiyan Zhao et al.
- GPT-4 Enhanced Multimodal Grounding For Autonomous Driving: Leveraging Cross-modal Attention With Large Language Models Haicheng Liao et al.
- Extending Context Window Of Large Language Models Via Positional Interpolation Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian
- Recommender Systems With Generative Retrieval Shashank Rajput et al.
- Decoding ChatGPT: A Taxonomy Of Existing Research, Current Challenges, And Possible Future Directions Shahab Saquib Sohail et al.
- Seamless: Multilingual Expressive And Streaming Speech Translation Seamless Communication et al.
- ChatGPT Or Human? Detect And Explain. Explaining Decisions Of Machine Learning Model For Detecting Short ChatGPT-generated Text Sandra Mitrović, Davide Andreoletti, Omran Ayoub
- Let's Have A Chat! A Conversation With ChatGPT: Technology, Applications, And Limitations Sakib Shahriar, Kadhim Hayawi
- TinyStories: How Small Can Language Models Be And Still Speak Coherent English? Ronen Eldan, Yuanzhi Li
- PaLM 2 Technical Report Rohan Anil et al.
- LLaMA-Adapter: Efficient Fine-tuning Of Language Models With Zero-init Attention Renrui Zhang et al.
- Grounded Text-to-image Synthesis With Attention Refocusing Quynh Phung, Songwei Ge, Jia-bin Huang
- MedCPT: Contrastive Pre-trained Transformers With Large-scale PubMed Search Logs For Zero-shot Biomedical Information Retrieval Qiao Jin et al.
- Scaling Laws For Language Encoding Models In fMRI Richard Antonello, Aditya Vaidya, Alexander G. Huth
- GPT-4 Technical Report OpenAI et al.
- Faith And Fate: Limits Of Transformers On Compositionality Nouha Dziri et al.
- Harnessing LLMs In Curricular Design: Using GPT-4 To Support Authoring Of Learning Objectives Pragnya Sridhar et al.
- Scaling Vision Transformers To 22 Billion Parameters Mostafa Dehghani et al.
- A Survey Of Large Language Models Wayne Xin Zhao et al.
- Unlocking The Potential Of ChatGPT: A Comprehensive Exploration Of Its Applications, Advantages, Limitations, And Future Directions In Natural Language Processing Walid Hariri
- Language Model Behavior: A Comprehensive Survey Tyler A. Chang, Benjamin K. Bergen
- Automated Reading Passage Generation With OpenAI's Large Language Model Ummugul Bezirhan, Matthias Von Davier
- FlashAttention-2: Faster Attention With Better Parallelism And Work Partitioning Tri Dao
- Automatic Semantic Augmentation Of Language Model Prompts (for Code Summarization) Toufique Ahmed, Kunal Suresh Pai, Premkumar Devanbu, Earl T. Barr
- MultiModal-GPT: A Vision And Language Model For Dialogue With Humans Tao Gong et al.
- Textbooks Are All You Need Suriya Gunasekar et al.
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.
- InstructBLIP: Towards General-purpose Vision-language Models With Instruction Tuning Wenliang Dai et al.
- PaLI-3 Vision Language Models: Smaller, Faster, Stronger Xi Chen et al.
- REPLUG: Retrieval-augmented Black-box Language Models Weijia Shi et al.
- MovieChat: From Dense Token To Sparse Memory For Long Video Understanding Enxin Song et al.
- SparseGPT: Massive Language Models Can Be Accurately Pruned In One-shot Elias Frantar, Dan Alistarh
- Read-only Prompt Optimization For Vision-language Few-shot Learning Dongjun Lee et al.
- Visual ChatGPT: Talking, Drawing And Editing With Visual Foundation Models Chenfei Wu et al.
- ChatGPT And A New Academic Reality: Artificial Intelligence-written Research Papers And The Ethics Of The Large Language Models In Scholarly Publishing Brady Lund et al.
- RWKV: Reinventing RNNs For The Transformer Era Bo Peng et al.
- 3D-VisTA: Pre-trained Transformer For 3D Vision And Text Alignment Ziyu Zhu et al.
- Scaling Transformer To 1M Tokens And Beyond With RMT Aydar Bulatov, Yuri Kuratov, Yermek Kapushev, Mikhail S. Burtsev
- ChatGPT: Applications, Opportunities, And Threats Aram Bahrini et al.
- The Impact Of Positional Encoding On Length Generalization In Transformers Amirhossein Kazemnejad, Inkit Padhi, Karthikeyan Natesan Ramamurthy, Payel Das, Siva Reddy
- How Good Are GPT Models At Machine Translation? A Comprehensive Evaluation Amr Hendy et al.
- Mamba: Linear-time Sequence Modeling With Selective State Spaces Albert Gu, Tri Dao
- Calibrated Language Models Must Hallucinate Adam Tauman Kalai, Santosh S. Vempala
- Do Large Language Models Resemble Humans In Language Use? Zhenguang G. Cai, Xufeng Duan, David A. Haslett, Shuqi Wang, Martin J. Pickering
- BiomedGPT: Open Multimodal Generative Pre-trained Transformer For Biomedicine Yizhen Luo et al.
- A Comparative Study Of Pretrained Language Models For Long Clinical Text Yikuan Li, Ramsey M. Wehbe, Faraz S. Ahmad, Hanyin Wang, Yuan Luo
- Textbooks Are All You Need II: Phi-1.5 Technical Report Yuanzhi Li et al.
- Retentive Network: A Successor To Transformer For Large Language Models Yutao Sun et al.
- EVA-02: A Visual Representation For Neon Genesis Yuxin Fang et al.
- Transformers Are SSMs: Generalized Models And Efficient Algorithms Through Structured State Space Duality Tri Dao, Albert Gu
- The Era Of 1-bit LLMs: All Large Language Models Are In 1.58 Bits Shuming Ma et al.
- Hidden Flaws Behind Expert-level Accuracy Of Multimodal GPT-4 Vision In Medicine Qiao Jin et al.
- From Text To Transformation: A Comprehensive Review Of Large Language Models' Versatility Pravneet Kaur et al.
- Jamba: A Hybrid Transformer-mamba Language Model Opher Lieber et al.
- Exploring ChatGPT And Its Impact On Society Md. Asraful Haque, Shuai Li
- History Of Generative Artificial Intelligence (AI) Chatbots: Past, Present, And Future Development Md. Al-amin et al.
- A Survey Of Resource-efficient LLM And Multimodal Foundation Models Mengwei Xu et al.
- xLSTM: Extended Long Short-term Memory Maximilian Beck et al.
- LinRec: Linear Attention Mechanism For Long-term Sequential Recommender Systems Langming Liu et al.
- PixArt-Σ: Weak-to-strong Training Of Diffusion Transformer For 4K Text-to-image Generation Junsong Chen et al.
- Revolutionizing Finance With LLMs: An Overview Of Applications And Insights Huaqin Zhao et al.
- Gemma 2: Improving Open Language Models At A Practical Size Gemma Team et al.
- AI And Memory Wall Amir Gholami et al.
- Yi: Open Foundation Models By 01.AI 01.AI et al.
- Large Language Model (LLM) AI Text Generation Detection Based On Transformer Deep Learning Algorithm Yuhong Mo, Hao Qin, Yushan Dong, Ziyi Zhu, Zhenglin Li
🏷 Uncategorized
- Lingke: A Fine-grained Multi-turn Chatbot For Customer Service Pengfei Zhu, Zhuosheng Zhang, Jiangtong Li, Yafang Huang, Hai Zhao
- A Syntactically Constrained Bidirectional-asynchronous Approach For Emotional Conversation Generation Jingyuan Li, Xiao Sun
- Why Are Sequence-to-sequence Models So Dull? Understanding The Low-diversity Problem Of Chatbots Shaojie Jiang, Maarten De Rijke
- Retrieve And Refine: Improved Sequence Generation Models For Dialogue Jason Weston, Emily Dinan, Alexander H. Miller
- Response Generation By Context-aware Prototype Editing Yu Wu et al.
- DeepCopy: Grounded Response Generation With Hierarchical Pointer Networks Semih Yavuz, Abhinav Rastogi, Guan-lin Chao, Dilek Hakkani-Tur
- Negated And Misprimed Probes For Pretrained Language Models: Birds Can Talk, But Cannot Fly Nora Kassner, Hinrich Schütze
- Pre-trained Language Model Representations For Language Generation Sergey Edunov, Alexei Baevski, Michael Auli
- Episodic Memory In Lifelong Language Learning Cyprien De Masson D'autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama
- Efficient Passage Retrieval With Hashing For Open-domain Question Answering Ikuya Yamada, Akari Asai, Hannaneh Hajishirzi
- Q²: Evaluating Factual Consistency In Knowledge-grounded Dialogues Via Question Generation And Question Answering Or Honovich et al.
- Lawformer: A Pre-trained Language Model For Chinese Legal Long Documents Chaojun Xiao, Xueyu Hu, Zhiyuan Liu, Cunchao Tu, Maosong Sun
- Human Heuristics For Ai-generated Language Are Flawed Maurice Jakesch, Jeffrey Hancock, Mor Naaman
- InCoder: A Generative Model For Code Infilling And Synthesis Daniel Fried et al.
- Turning Large Language Models Into Cognitive Models Marcel Binz, Eric Schulz
- Lost In The Middle: How Language Models Use Long Contexts Nelson F. Liu et al.
- Art Or Artifice? Large Language Models And The False Promise Of Creativity Tuhin Chakrabarty, Philippe Laban, Divyansh Agarwal, Smaranda Muresan, Chien-sheng Wu
- Generating With Confidence: Uncertainty Quantification For Black-box Large Language Models Zhen Lin, Shubhendu Trivedi, Jimeng Sun
- Understanding And Detecting Hallucinations In Neural Machine Translation Via Model Introspection Weijia Xu, Sweta Agrawal, Eleftheria Briakou, Marianna J. Martindale, Marine Carpuat
- Summarization Is (almost) Dead Xiao Pu, Mingqi Gao, Xiaojun Wan
- Active Retrieval Augmented Generation Zhengbao Jiang et al.
- Getting From Generative AI To Trustworthy AI: What LLMs Might Learn From Cyc Doug Lenat, Gary Marcus
- Element-aware Summarization With Large Language Models: Expert-aligned Evaluation And Chain-of-thought Method Yiming Wang, Zhuosheng Zhang, Rui Wang
- Mol-instructions: A Large-scale Biomolecular Instruction Dataset For Large Language Models Yin Fang et al.
- Can Large Language Models Reason And Plan? Subbarao Kambhampati
- Hallucination Detection: Robustly Discerning Reliable Answers In Large Language Models Yuyan Chen et al.
🏷 Vector Indexing
🏷 WMT
- Attention Is All You Need Ashish Vaswani et al.
- Non-autoregressive Neural Machine Translation Jiatao Gu, James Bradbury, Caiming Xiong, Victor O. K. Li, Richard Socher
- Weighted Transformer Network For Machine Translation Karim Ahmed, Nitish Shirish Keskar, Richard Socher
- A Study Of Reinforcement Learning For Neural Machine Translation Lijun Wu, Fei Tian, Tao Qin, Jianhuang Lai, Tie-yan Liu
- "Bilingual Expert" Can Find Translation Errors Kai Fan et al.
- The MeMAD Submission To The WMT18 Multimodal Translation Task Stig-Arne Grönroos et al.
- Transformers Without Tears: Improving The Normalization Of Self-attention Toan Q. Nguyen, Julian Salazar
- Pay Less Attention With Lightweight And Dynamic Convolutions Felix Wu, Angela Fan, Alexei Baevski, Yann N. Dauphin, Michael Auli
- On The Use Of BERT For Neural Machine Translation Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
- Cross-lingual Language Model Pretraining Guillaume Lample, Alexis Conneau
- Insertion Transformer: Flexible Sequence Generation Via Insertion Operations Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
- The Evolved Transformer David R. So, Chen Liang, Quoc V. Le
- Towards Making The Most Of BERT In Neural Machine Translation Jiacheng Yang et al.
- Microsoft Translator At WMT 2019: Towards Large-scale Document-level Neural Machine Translation Marcin Junczys-Dowmunt
- Modeling Recurrence For Transformer Jie Hao et al.
- A Generalized Framework Of Sequence Generation With Application To Undirected Sequence Models Elman Mansimov, Alex Wang, Sean Welleck, Kyunghyun Cho
- Acquiring Knowledge From Pre-trained Model To Neural Machine Translation Rongxiang Weng, Heng Yu, Shujian Huang, Shanbo Cheng, Weihua Luo
- A Tensorized Transformer For Language Modeling Xindian Ma et al.
- Syntax-infused Transformer And BERT Models For Machine Translation And Natural Language Understanding Dhanasekar Sundararaman et al.
- Analyzing Multi-head Self-attention: Specialized Heads Do The Heavy Lifting, The Rest Can Be Pruned Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
- Very Deep Transformers For Neural Machine Translation Xiaodong Liu, Kevin Duh, Liyuan Liu, Jianfeng Gao
- Non-autoregressive Machine Translation With Latent Alignments Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi
- Beyond English-centric Multilingual Machine Translation Angela Fan et al.
- Incorporating BERT Into Parallel Sequence Decoding With Adapters Junliang Guo et al.
- BLEURT: Learning Robust Metrics For Text Generation Thibault Sellam, Dipanjan Das, Ankur P. Parikh
- XLM-T: Scaling Up Multilingual Machine Translation With Pretrained Cross-lingual Transformer Encoders Shuming Ma et al.
- Lite Transformer With Long-short Range Attention Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
- HAT: Hardware-aware Transformers For Efficient Natural Language Processing Hanrui Wang et al.
- BERT, mBERT, Or BiBERT? A Study On Contextualized Embeddings For Neural Machine Translation Haoran Xu, Benjamin Van Durme, Kenton Murray
- Contrastive Learning For Many-to-many Multilingual Neural Machine Translation Xiao Pan, Mingxuan Wang, Liwei Wu, Lei Li
- Mind The Gap: Assessing Temporal Generalization In Neural Language Models Angeliki Lazaridou et al.
- TURINGBENCH: A Benchmark Environment For Turing Test In The Age Of Neural Text Generation Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee
- Hierarchical Learning For Generation With Long Source Sequences Tobias Rohde, Xiaoxia Wu, Yinhan Liu
- GTrans: Grouping And Fusing Transformer Layers For Neural Machine Translation Jian Yang et al.
- Large Language Models Are State-of-the-art Evaluators Of Translation Quality Tom Kocmi, Christian Federmann
- The Unreasonable Effectiveness Of Few-shot Learning For Machine Translation Xavier Garcia et al.