01 Managed LLM Platforms — Bedrock, Vertex AI, Azure OpenAI
CODE QUIZ 1 OUTPUTS
02 Inference Platform Economics — Fireworks, Together, Baseten, Modal, Replicate, Anyscale
03 GPU Autoscaling on Kubernetes — Karpenter, KAI Scheduler, Gang Scheduling
04 vLLM Serving Internals: PagedAttention, Continuous Batching, Chunked Prefill
05 EAGLE-3 Speculative Decoding in Production
06 SGLang and RadixAttention for Prefix-Heavy Workloads
07 TensorRT-LLM on Blackwell with FP8 and NVFP4
08 Inference Metrics — TTFT, TPOT, ITL, Goodput, P99
09 Production Quantization — AWQ, GPTQ, GGUF K-quants, FP8, MXFP4/NVFP4
10 Cold Start Mitigation for Serverless LLMs
11 Multi-Region LLM Serving and KV Cache Locality
12 Edge Inference — Apple Neural Engine, Qualcomm Hexagon, WebGPU/WebLLM, Jetson
13 LLM Observability Stack Selection
14 Prompt Caching and Semantic Caching Economics
15 Batch APIs — the 50% Discount as Industry Standard
16 Model Routing as a Cost-Reduction Primitive
17 Disaggregated Prefill/Decode — NVIDIA Dynamo and llm-d
18 vLLM Production Stack with LMCache KV Offloading
19 AI Gateways — LiteLLM, Portkey, Kong AI Gateway, Bifrost
20 Shadow Traffic, Canary Rollout, and Progressive Deployment for LLMs
21 A/B Testing LLM Features — GrowthBook, Statsig, and the Vibes Problem
22 Load Testing LLM APIs — Why k6 and Locust Lie
23 SRE for AI — Multi-Agent Incident Response, Runbooks, Predictive Detection
24 Chaos Engineering for LLM Production
25 Security — Secrets, API Key Rotation, Audit Logs, Guardrails
26 Compliance — SOC 2, HIPAA, GDPR, PCI-DSS, EU AI Act, ISO 42001
27 FinOps for LLMs — Unit Economics and Multi-Tenant Attribution
28 Self-Hosted Serving Selection — llama.cpp, Ollama, TGI, vLLM, SGLang