Infrastructure And Production
28 lessons
01 Managed LLM Platforms — Bedrock, Vertex AI, Azure OpenAI
02 Inference Platform Economics — Fireworks, Together, Baseten, Modal, Replicate, Anyscale
03 GPU Autoscaling on Kubernetes — Karpenter, KAI Scheduler, Gang Scheduling
04 vLLM Serving Internals: PagedAttention, Continuous Batching, Chunked Prefill
05 EAGLE-3 Speculative Decoding in Production
06 SGLang and RadixAttention for Prefix-Heavy Workloads
07 TensorRT-LLM on Blackwell with FP8 and NVFP4
08 Inference Metrics — TTFT, TPOT, ITL, Goodput, P99
09 Production Quantization — AWQ, GPTQ, GGUF K-quants, FP8, MXFP4/NVFP4
10 Cold Start Mitigation for Serverless LLMs
11 Multi-Region LLM Serving and KV Cache Locality
12 Edge Inference — Apple Neural Engine, Qualcomm Hexagon, WebGPU/WebLLM, Jetson
13 LLM Observability Stack Selection
14 Prompt Caching and Semantic Caching Economics
15 Batch APIs — the 50% Discount as Industry Standard
16 Model Routing as a Cost-Reduction Primitive
17 Disaggregated Prefill/Decode — NVIDIA Dynamo and llm-d
18 vLLM Production Stack with LMCache KV Offloading
19 AI Gateways — LiteLLM, Portkey, Kong AI Gateway, Bifrost
20 Shadow Traffic, Canary Rollout, and Progressive Deployment for LLMs
21 A/B Testing LLM Features — GrowthBook, Statsig, and the Vibes Problem
22 Load Testing LLM APIs — Why k6 and Locust Lie
23 SRE for AI — Multi-Agent Incident Response, Runbooks, Predictive Detection
24 Chaos Engineering for LLM Production
25 Security — Secrets, API Key Rotation, Audit Logs, Guardrails
26 Compliance — SOC 2, HIPAA, GDPR, PCI-DSS, EU AI Act, ISO 42001
27 FinOps for LLMs — Unit Economics and Multi-Tenant Attribution
28 Self-Hosted Serving Selection — llama.cpp, Ollama, TGI, vLLM, SGLang