# Llama4-Maverick

This document shows how to run Llama4-Maverick on B200 with the PyTorch workflow and how to run performance benchmarks.

## Table of Contents

- [Performance Benchmarks](#performance-benchmarks)
  - [B200 Max-throughput](#b200-max-throughput)
  - [B200 Min-latency](#b200-min-latency)
  - [B200 Balanced](#b200-balanced)
- [Advanced Configuration](#advanced-configuration)
  - [Configuration tuning](#configuration-tuning)
- [Troubleshooting](#troubleshooting)
  - [Out of memory issues](#out-of-memory-issues)

## Performance Benchmarks

This section provides the steps to launch the TensorRT LLM server and run performance benchmarks for different scenarios.

### B200 Max-throughput

#### 1. Prepare TensorRT LLM extra configs

A throughput-oriented config enables attention data parallelism and pads CUDA graphs up to a large batch size. The option values below are illustrative; tune them for your deployment.

```bash
cat > ./extra-llm-api-config.yml <<EOF
# Illustrative values; tune for your deployment
enable_attention_dp: true
cuda_graph_config:
  enable_padding: true
  max_batch_size: 512
EOF
```

### B200 Min-latency

#### 1. Prepare TensorRT LLM extra configs

A latency-oriented config disables attention data parallelism and keeps CUDA graph batch sizes small. The option values below are illustrative; tune them for your deployment.

```bash
cat > ./extra-llm-api-config.yml <<EOF
# Illustrative values; tune for your deployment
enable_attention_dp: false
cuda_graph_config:
  enable_padding: true
  max_batch_size: 8
EOF
```

### B200 Balanced

#### 1. Prepare TensorRT LLM extra configs

A balanced config trades off throughput and latency with a moderate CUDA graph batch size. The option values below are illustrative; tune them for your deployment.

```bash
cat > ./extra-llm-api-config.yml <<EOF
# Illustrative values; tune for your deployment
enable_attention_dp: true
cuda_graph_config:
  enable_padding: true
  max_batch_size: 128
EOF
```
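With an extra config in place, the server can be launched with `trtllm-serve`. A minimal sketch, assuming the `nvidia/Llama-4-Maverick-17B-128E-Instruct-FP8` checkpoint and an 8-GPU node; the model path, parallelism, and port here are assumptions, not prescribed settings:

```bash
# Launch an OpenAI-compatible server with the extra config prepared above.
# Model path, tp_size, and port are illustrative assumptions.
trtllm-serve nvidia/Llama-4-Maverick-17B-128E-Instruct-FP8 \
    --backend pytorch \
    --tp_size 8 \
    --port 8000 \
    --extra_llm_api_options ./extra-llm-api-config.yml
```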
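Once the server is up, a quick request against the OpenAI-compatible endpoint verifies the deployment end to end before running full benchmarks. The `model` field must match the name passed to `trtllm-serve`, and the port matches the assumed launch command above:

```bash
# Smoke-test the OpenAI-compatible chat endpoint (host and port are assumptions).
curl -s http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "nvidia/Llama-4-Maverick-17B-128E-Instruct-FP8",
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 32
    }'
```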