Step 2: Reproduce Large Language Model (from scratch)

Let’s Continue with the Second Step

At this point, you've built a strong foundation in generative deep learning. Now it’s time to dive into how models like ChatGPT are actually built.

Table of Contents

book-open Build a Large Language Model (From Scratch)

This book is an excellent next step. Written by Sebastian Raschka, it guides you through the full process of developing a GPT-like Large Language Model (LLM) from scratch.

You'll Learn How To

  • Understand the architecture of LLMs

  • Implement your own transformer model

  • Pretrain it on a text corpus

  • Finetune it for specific tasks

The book is highly practical and comes with a comprehensive GitHub repositoryarrow-up-right full of hands-on examples built in PyTorch. Don’t worry if you’re new to PyTorch, the book even includes an appendix to help you get started with it.

Outline of the Book

  • Ch 1: Understanding Large Language Models

  • Ch 2: Working with Text Data

  • Ch 3: Coding Attention Mechanisms

  • Ch 4: Implementing a GPT Model from Scratch

  • Ch 5: Pretraining on Unlabeled Data

  • Ch 6: Finetuning for Text Classification

  • Ch 7: Finetuning to Follow Instructions

Unlike the first book where you could skip around, the chapters in this book are sequentially dependent, each one builds directly on the previous. So, it's best to read the book from Chapter 1 all the way through to Chapter 7, without skipping.

What You'll Achieve

Once you finish this book, you'll not only understand how LLMs like ChatGPT work, but you'll also have built your own simplified version.

And with that, you’re ready for Step 3: Reinforcement Learning from Human Feedback (RLHF), which is important for LLM reasoning.

Want to Go Deeper?

This book focuses primarily on practical implementation. If you’re also interested in the mathematical theory behind LLMs (which not everyone is), there’s a helpful preprint on arXiv: Foundations of Large Language Modelsarrow-up-right that complements the material.

Very Useful Alternative Resource

If you prefer video over textbooks, one of the best resources for learning about LLMs comes from one of the most prominent researchers in the field, Andrej Karpathy. His video series is a must-watch due to the sheer depth and clarity of the explanations. Not only does he teach in an easy-to-understand manner, but he also codes everything step by step.

Last updated