Supercharge your LLM performance with Amazon SageMaker Large Model Inference container v15 | Amazon Web Services
2025-04-22
Today, we’re excited to announce the launch of Amazon SageMaker Large Model Inference (LMI) container v15, powered by vLLM 0.8.4 with support for the vLLM V1 engine. This version now supports the latest open-source models, such as Meta’s Llama 4 models Scout and Maverick, Google’s Gemma 3, Alibaba’s Qwen, Mistral…