🍉's Blog

Transformers Deep Dive

16 lessons

01 Why Transformers — The Problems with RNNs
Code · 1 output
02 Self-Attention from Scratch
Code · Quiz · 1 output
03 Multi-Head Attention
Code · 1 output
04 Positional Encoding — Sinusoidal, RoPE, ALiBi
Code · 1 output
05 The Full Transformer — Encoder + Decoder
Code · 1 output
06 BERT — Masked Language Modeling
Code · 1 output
07 GPT — Causal Language Modeling
Code · 1 output
08 T5, BART — Encoder-Decoder Models
Code · 1 output
09 Vision Transformers (ViT)
Code · 1 output
10 Audio Transformers — Whisper Architecture
Code · 1 output
11 Mixture of Experts (MoE)
Code · 1 output
12 KV Cache, Flash Attention & Inference Optimization
Code · 1 output
13 Scaling Laws
Code · 1 output
14 Build a Transformer from Scratch — The Capstone
Code · 1 output
15 Attention Variants — Sliding Window, Sparse, Differential
Code · 1 output
16 Speculative Decoding — Draft, Verify, Repeat
Code · 1 output

© 2026 🍉's Blog · All rights reserved