HyperPod

Modern generative AI model providers require unprecedented computational scale, with pre-training often involving thousands of accelerators running continuously for days, and sometimes months. Foundation models (FMs) demand distributed training clusters (coordinated groups of accelerated compute instances, using frameworks like PyTorch) to parallelize workloads across hundreds of accelerators …
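As a rough illustration of what parallelizing a training workload across accelerators looks like in PyTorch, here is a minimal data-parallel sketch using DistributedDataParallel. The toy linear model, batch size, and learning rate are placeholders of my own, not details from the post above; the script assumes it is launched by torchrun, which sets the RANK, LOCAL_RANK, and WORLD_SIZE environment variables.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun populates the process-group environment variables.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model; a real FM would be a large transformer.
    model = torch.nn.Linear(1024, 1024).cuda()
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):
        inputs = torch.randn(32, 1024, device="cuda")  # stand-in for real batches
        loss = model(inputs).pow(2).mean()             # stand-in for a real loss
        loss.backward()   # DDP all-reduces gradients across every rank here
        optimizer.step()
        optimizer.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

On a single 8-GPU node this could be launched with, for example, `torchrun --nproc_per_node=8 train.py`; multi-node runs add `--nnodes` and rendezvous settings.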

This post is based on a technical report written by Kazuki Fujii, who led the Llama 3.3 Swallow model development. The Institute of Science Tokyo has successfully trained Llama 3.3 Swallow, a 70-billion-parameter large language model (LLM) with enhanced Japanese capabilities, using Amazon SageMaker HyperPod. The model demonstrates superior performance …

This post was co-written with Renato Nascimento, Felipe Viana, and Andre Von Zuben from Articul8. Generative AI is reshaping industries, offering new efficiencies, automation, and innovation. However, generative AI requires powerful, scalable, and resilient infrastructure that optimizes large-scale model training, providing rapid iteration and efficient compute utilization with purpose-built infrastructure and …

Climate tech startups are companies that use technology and innovation to address the climate crisis, with a primary focus on either reducing greenhouse gas emissions or helping society adapt to climate change impacts. Their unifying mission is to create scalable solutions that accelerate the transition to a sustainable, low-carbon future.

This post is co-written with Ken Tsui, Edward Tsoi, and Mickey Yip from Apoidea Group. The banking industry has long struggled with the inefficiencies associated with repetitive processes such as information extraction, document review, and auditing. These tasks, which require significant human resources, slow down critical operations such as Know Your Customer (KYC) …

Foundation model (FM) training and inference have led to a significant increase in computational needs across the industry. These models require massive amounts of accelerated compute to train and operate effectively, pushing the boundaries of traditional computing infrastructure. They require efficient systems for distributing workloads across multiple GPU-accelerated servers …
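To sketch what distributing a model's state across multiple GPU-accelerated servers can look like in PyTorch, the hypothetical example below uses FullyShardedDataParallel (FSDP), which shards parameters, gradients, and optimizer state across ranks so no single GPU holds the full model. The layer sizes and single training step are illustrative assumptions, not details from the post above.

```python
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

def main():
    # Assumes a torchrun launch, which provides rank/world-size env vars.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Illustrative stand-in for a much larger transformer stack.
    model = torch.nn.Sequential(
        torch.nn.Linear(4096, 4096),
        torch.nn.GELU(),
        torch.nn.Linear(4096, 4096),
    )

    # FSDP shards parameters, gradients, and optimizer state across ranks;
    # each GPU materializes full layers only transiently during compute.
    model = FSDP(model, device_id=local_rank)

    # The optimizer must be constructed after wrapping so it sees the shards.
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    inputs = torch.randn(8, 4096, device="cuda")  # placeholder batch
    loss = model(inputs).pow(2).mean()            # placeholder loss
    loss.backward()
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

A two-node run might be launched with `torchrun --nnodes=2 --nproc_per_node=8 --rdzv_backend=c10d --rdzv_endpoint=<head-node>:29500 train_fsdp.py`; on SageMaker HyperPod, the cluster's orchestrator (Slurm or Amazon EKS) typically handles scheduling these processes and replacing faulty nodes.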