- Introduction
- 1: Get started
  - Project setup
  - Model configuration
- 2: Build the transformer block
  - Feed-forward network
  - Causal masking
  - Multi-head attention
  - Layer normalization
  - Transformer block
- 3: Assemble the model
  - Stacking transformer blocks
  - Language model head
- 4: Generate text
  - Encode and decode tokens
  - Text generation
  - Load weights and run model