LLM Powered Autonomous Agents: A deep dive into how large language models are powering the next generation of autonomous agents, enabling systems to perform complex tasks with minimal human input.
Google “We Have No Moat, And Neither Does OpenAI”: Leaked internal Google document discussing the competitive landscape of AI and arguing that neither Google nor OpenAI has a sustainable competitive advantage in the long term.
Prompt Engineering: An introduction to prompt engineering techniques, providing guidelines on how to effectively interact with large language models to obtain the best results.
How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources: This article investigates how GPT models acquire their emergent abilities, tracing them back to the training data and architectures used.
Why did all of the public reproduction of GPT-3 fail?: This post explores the difficulties and challenges researchers faced when attempting to reproduce the capabilities of GPT-3, offering insights into why these efforts largely fell short.
Alpaca: Synthetic Data for LLMs: Stanford’s approach to generating synthetic instruction-following data for fine-tuning large language models, using OpenAI’s API in a self-instruct style.
Evol-Instruct: Improving Dataset Quality: Techniques from the WizardLM work for enhancing instruction datasets by iteratively rewriting prompts into more complex and diverse variants.
Orca: High-Quality Data Generation: The Orca paper, which shows how to generate higher-quality synthetic data by training on the step-by-step explanation traces of a stronger teacher model such as GPT-4.
Scaling Laws for LLMs: A study on scaling laws, which predict LLM performance as a function of model size, dataset size, and compute.
Chinchilla’s Wild Implications: A post on what the Chinchilla scaling laws imply for compute-optimal training, in particular that available training data, rather than parameter count, increasingly becomes the limiting factor (the fitted loss law is reproduced below).
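For reference, the parametric loss fit at the center of these scaling-law discussions (from the Chinchilla paper, Hoffmann et al. 2022) is approximately:

```latex
% Chinchilla parametric loss fit: N = number of parameters, D = number of training tokens
L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
\qquad E \approx 1.69,\quad A \approx 406.4,\quad B \approx 410.7,\quad \alpha \approx 0.34,\quad \beta \approx 0.28
```

Minimizing this loss under a fixed compute budget gives the often-quoted rule of thumb of roughly 20 training tokens per model parameter.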
TinyLlama: A project that pre-trains a small (1.1B-parameter) Llama-architecture model from scratch, providing practical insights into LLM pre-training.
BIG-bench: LLM Benchmarking: A large-scale, collaborative benchmark for evaluating LLM capabilities across a wide range of tasks.
Training a Causal Language Model from Scratch: Hugging Face tutorial on pre-training GPT-2 from scratch using the transformers library (a minimal model-setup sketch follows this list).
LLMDataHub: Curated Datasets for LLMs: Collection of datasets for pre-training, fine-tuning, and RLHF of large language models.
Perplexity in LLMs: Hugging Face guide on measuring model perplexity for text generation tasks (see the short sketch after this list).
Karpathy’s Zero to Hero: GPT: A roughly two-hour lecture from Andrej Karpathy’s “Neural Networks: Zero to Hero” series on building a GPT model from scratch, focusing on transformer fundamentals such as self-attention.
Karpathy’s Intro to Tokenization: A detailed introduction to tokenization for LLMs, explaining how text is converted into tokens (for example, via byte-pair encoding) before being fed to transformer models.
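To make the tokenization and perplexity resources above concrete, here is a minimal sketch assuming the Hugging Face transformers library and the small public gpt2 checkpoint; it is an illustration under those assumptions, not code taken from the linked guides.

```python
# Tokenize a sentence into BPE tokens, then compute its perplexity with GPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "Large language models predict the next token."
enc = tokenizer(text, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist()))  # the BPE tokens

with torch.no_grad():
    out = model(**enc, labels=enc["input_ids"])  # labels are shifted internally
perplexity = torch.exp(out.loss)                 # exp of the mean cross-entropy
print(f"Perplexity: {perplexity.item():.2f}")
```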
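As a companion to the pre-training tutorial above, here is a sketch of instantiating a fresh, untrained GPT-2-style model ready for pre-training; the configuration values are illustrative and not the tutorial’s exact settings.

```python
# Set up a small, randomly initialized GPT-2-style model for pre-training from scratch.
from transformers import AutoTokenizer, GPT2Config, GPT2LMHeadModel

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # reuse an existing BPE vocabulary

config = GPT2Config(
    vocab_size=tokenizer.vocab_size,
    n_positions=512,   # maximum context length
    n_embd=256,        # hidden size
    n_layer=6,         # number of transformer blocks
    n_head=8,          # attention heads per block
)
model = GPT2LMHeadModel(config)  # random weights, ready to be pre-trained
print(f"Parameters: {model.num_parameters() / 1e6:.1f}M")
```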
The AI Alignment Podcast: Conversations with leading AI researchers and thinkers such as Stuart Russell and Yoshua Bengio, covering cutting-edge research in AI alignment and deep learning.
Lex Fridman Podcast: Features interviews with AI pioneers like Yann LeCun, Geoffrey Hinton, Demis Hassabis, and Andrej Karpathy, discussing AI, deep learning, and the future of technology.
Machine Learning Street Talk: In-depth discussions with AI researchers such as Yannic Kilcher and Connor Leahy, tackling topics in AI ethics, deep learning, and more.
The Gradient Podcast: Interviews with researchers and practitioners in AI, deep learning, and NLP, including guests like Fei-Fei Li and Sebastian Ruder.
TWIML AI Podcast: Host Sam Charrington interviews top minds in AI and machine learning, such as Andrew Ng and Ian Goodfellow, diving deep into industry trends and research breakthroughs.
Data Skeptic: A podcast covering data science, machine learning, and AI, featuring leading experts from academia and industry, like Charles Isbell and Dario Amodei.
Below is a collection of university and online courses that offer a deep dive into the concepts, tools, and applications of Large Language Models (LLMs). These courses range from theoretical foundations to practical applications in business and data science.
Stanford University - TECH 16: Large Language Models for Business with Python: This course covers the use of LLMs in business applications, with a focus on practical programming with Python. Students learn how to integrate LLMs into business processes to drive innovation and efficiency.
ETH Zürich - 263-5354-00L: Large Language Models: Focused on the theoretical underpinnings and current developments of LLMs, this course covers a broad range of topics from model training to application.
UNC Chapel Hill - COMP790-101: Large Language Models: This seminar-style course reviews the latest research on LLMs, covering both foundational knowledge and emerging trends in their development.
Coursera - Natural Language Processing with Transformers: This course introduces transformers, which are the foundation of modern LLMs. It focuses on using transformers for various NLP tasks such as text classification, summarization, and translation.
DataCamp - Transformer Models for NLP: Learn how to leverage transformer models to perform advanced natural language processing tasks with hands-on coding exercises in Python.
Udemy - GPT-3 and OpenAI API: A Guide for Building LLM-Powered Applications: This course provides practical insights into using GPT-3 and OpenAI’s API to build applications that utilize LLMs, with a focus on creating conversational agents and content generation.
DeepLearning.AI - Generative AI with Large Language Models: This course from DeepLearning.AI covers the key concepts of generative AI, with a particular focus on LLMs. It includes hands-on practice in fine-tuning LLMs, prompt engineering, and applying these models to real-world use cases.
LangChain: A framework for building LLM-powered applications with modular integrations, memory, and chaining prompts.
LlamaIndex: Connects LLMs with external data like documents and databases, ideal for knowledge-augmented applications.
Dyson: Enables dynamic instruction tuning and fine-tuning of LLMs with custom prompts and instructions.
LangGraph: A LangChain companion library for building stateful, multi-step LLM applications as graphs of nodes and edges, useful for agent workflows with branching and cycles.
DeepSpeed: Optimizes large-model training with techniques such as ZeRO partitioning, quantization, and CPU/NVMe offloading for memory efficiency.
Hugging Face Transformers: Provides tools for using, fine-tuning, and deploying transformer models like GPT and BERT.
OpenRouter: A unified API that routes requests to many LLM providers and models, such as GPT-4 and Claude, through a single endpoint.
Guidance: A library for programmatically structuring and constraining LLM outputs (for example, with templates and grammars) for complex tasks.
Haystack: A framework for building scalable LLM-powered search and retrieval systems, including RAG pipelines (a library-agnostic RAG sketch follows this list).
FastRAG: Efficient framework for low-latency, scalable Retrieval-Augmented Generation (RAG) pipelines.
DSPy: A framework for building LLM pipelines as declarative modules, with optimizers that automatically tune prompts and few-shot examples against a metric.
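Several of the frameworks above (LangChain, LlamaIndex, Haystack, FastRAG) center on retrieval-augmented generation. The following is a deliberately library-agnostic sketch of what a single RAG step does, assuming the sentence-transformers and transformers packages and small public checkpoints; it is not the API of any of the listed tools.

```python
# Retrieve the most relevant document by embedding similarity, then generate an answer
# conditioned on it: the basic retrieve-then-generate loop behind RAG pipelines.
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import pipeline

documents = [
    "The Chinchilla paper argues for training smaller models on more tokens.",
    "LoRA fine-tunes large models by learning low-rank weight updates.",
    "RAG pipelines retrieve relevant documents before generating an answer.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small public embedding model
doc_vecs = embedder.encode(documents, normalize_embeddings=True)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q
    return [documents[i] for i in np.argsort(-scores)[:k]]

query = "What does Chinchilla say about model size?"
context = "\n".join(retrieve(query))
generator = pipeline("text2text-generation", model="google/flan-t5-small")
prompt = f"Answer using the context.\nContext: {context}\nQuestion: {query}"
print(generator(prompt, max_new_tokens=64)[0]["generated_text"])
```

Production frameworks add document stores, chunking, re-ranking, and evaluation on top of this loop, but the core retrieve-then-generate pattern is the same.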
Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville: A foundational book that covers the principles of deep learning. It provides theoretical insights and practical applications, making it essential for understanding the building blocks of LLMs.
Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra, and Thomas Wolf: This book offers a practical guide to using transformer models for NLP tasks, with a focus on tools like Hugging Face’s libraries. It’s a great resource for anyone working with modern LLMs.
Transformers for Natural Language Processing by Denis Rothman: This book provides an in-depth look at transformer models, from BERT to GPT-3, and explains how to implement them for a variety of NLP tasks.
GPT-3: Building Innovative NLP Products Using Large Language Models by Sandra Kublik, Shubham Saboo, and Dhaval Pattani: A hands-on guide for building applications using GPT-3, covering everything from prompt engineering to integrating GPT-3 into real-world products.
Neural Networks and Deep Learning by Michael Nielsen: A classic introduction to neural networks and deep learning, providing a step-by-step guide to building and understanding deep models, which serve as the foundation for LLMs.
Hands-On Large Language Models: Language Understanding and Generation by Jay Alammar and Maarten Grootendorst: Provides practical tools for using LLMs in tasks like copywriting, summarization, and semantic search. It covers transformer architecture, generative models, and fine-tuning techniques to optimize LLMs for specific applications.
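As a small illustration of the kind of application task these books discuss (here, summarization), the following sketch uses the Hugging Face pipeline API with a small public checkpoint; it is an illustrative example only, not code from any of the books.

```python
# Summarize a short passage with a small public summarization checkpoint.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
article = (
    "Large language models are trained on massive text corpora and can be adapted "
    "to downstream tasks such as summarization, semantic search, and copywriting "
    "through prompting or fine-tuning."
)
summary = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```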
Please feel free to submit a web form to add more links to this page.