Introduction

1. Get started
   - Project setup
   - Model configuration
2. Build the transformer block
   - Feed-forward network
   - Causal masking
   - Multi-head attention
   - Layer normalization
   - Transformer block
3. Assemble the model
   - Stacking transformer blocks
   - Language model head
4. Generate text
   - Encode and decode tokens
   - Text generation
   - Load weights and run model