EKS

Accelerate generative AI inference with NVIDIA Dynamo and Amazon EKS | Amazon Web Services

This post is co-written with Kshitiz Gupta, Wenhan Tan, Arun Raman, Jiahong Liu, and Eiluth Triana Isaza from NVIDIA. As large language models (LLMs) and generative AI applications become increasingly prevalent, the demand for efficient, scalable, and low-latency inference solutions has grown. Traditional inference systems often struggle to meet these…

Build scalable containerized RAG based generative AI applications in AWS using Amazon EKS with Amazon Bedrock | Amazon Web Services

In: Artificial Intelligence

Generative artificial intelligence (AI) applications are commonly built using a technique called Retrieval Augmented Generation (RAG) that provides foundation models (FMs) access to additional data they didn’t have during training. This data is used to enrich the generative AI prompt to deliver more context-specific and accurate responses without continuously retraining…

Automate Amazon EKS troubleshooting using an Amazon Bedrock agentic workflow | Amazon Web Services

In: Artificial Intelligence

As organizations scale their Amazon Elastic Kubernetes Service (Amazon EKS) deployments, platform administrators face increasing challenges in efficiently managing multi-tenant clusters. Tasks such as investigating pod failures, addressing resource constraints, and resolving misconfiguration can consume significant time and effort. Instead of spending valuable engineering hours manually parsing logs, tracking metrics,…

Building the future of construction analytics: CONXAI’s AI inference on Amazon EKS | Amazon Web Services

In: Artificial Intelligence

This is a guest post co-written with Tim Krause, Lead MLOps Architect at CONXAI. CONXAI Technology GmbH is pioneering the development of an advanced AI platform for the Architecture, Engineering, and Construction (AEC) industry. Our platform uses advanced AI to empower construction domain experts to create complex use cases efficiently.

Deploy Meta Llama 3.1-8B on AWS Inferentia using Amazon EKS and vLLM | Amazon Web Services

In: Artificial Intelligence

With the rise of large language models (LLMs) like Meta Llama 3.1, there is an increasing need for scalable, reliable, and cost-effective solutions to deploy and serve these models. AWS Trainium and AWS Inferentia based instances, combined with Amazon Elastic Kubernetes Service (Amazon EKS), provide a performant and low cost…

Introducing Amazon EKS support in Amazon SageMaker HyperPod | Amazon Web Services

In: Artificial Intelligence

We are thrilled to introduce Amazon Elastic Kubernetes Service (Amazon EKS) support in Amazon SageMaker HyperPod, a purpose-built infrastructure engineered with resilience at its core. This capability allows for the seamless addition of SageMaker HyperPod managed compute to EKS clusters, using automated node and job resiliency features for foundation model…

Node problem detection and recovery for AWS Neuron nodes within Amazon EKS clusters | Amazon Web Services

In: Artificial Intelligence

Implementing hardware resiliency in your training infrastructure is crucial to mitigating risks and enabling uninterrupted model training. By implementing features such as proactive health monitoring and automated recovery mechanisms, organizations can create a fault-tolerant environment capable of handling hardware failures or other issues without compromising the integrity of the training…

Accelerate your generative AI distributed training workloads with the NVIDIA NeMo Framework on Amazon EKS | Amazon Web Services

In: Artificial Intelligence

In today’s rapidly evolving landscape of artificial intelligence (AI), training large language models (LLMs) poses significant challenges. These models often require enormous computational resources and sophisticated infrastructure to handle the vast amounts of data and complex algorithms involved. Without a structured framework, the process can become prohibitively time-consuming, costly, and…

Scale and simplify ML workload monitoring on Amazon EKS with AWS Neuron Monitor container | Amazon Web Services

In: Artificial Intelligence

Amazon Web Services is excited to announce the launch of the AWS Neuron Monitor container, an innovative tool designed to enhance the monitoring capabilities of AWS Inferentia and AWS Trainium chips on Amazon Elastic Kubernetes Service (Amazon EKS). This solution simplifies the integration of advanced monitoring tools such as Prometheus…
