optimization

Customers are increasingly using generative AI to enhance efficiency, personalize experiences, and drive innovation across various industries. For instance, generative AI can be used to perform text summarization, facilitate personalized marketing strategies, create business-critical chat-based assistants, and so on. However, as generative AI adoption grows, associated costs can escalate inContinue Reading

In the era of generative AI, new large language models (LLMs) are continually emerging, each with unique capabilities, architectures, and optimizations. Among these, Amazon Nova foundation models (FMs) deliver frontier intelligence and industry-leading cost-performance, available exclusively on Amazon Bedrock. Since its launch in 2024, generative AI practitioners, including the teamsContinue Reading

Yuewen Group is a global leader in online literature and IP operations. Through its overseas platform WebNovel, it has attracted about 260 million users in over 200 countries and regions, promoting Chinese web literature globally. The company also adapts quality web novels into films, animations for international markets, expanding theContinue Reading

DeepSeek-R1 models, now available on Amazon Bedrock Marketplace, Amazon SageMaker JumpStart, as well as a serverless model on Amazon Bedrock, were recently popularized by their long and elaborate thinking style, which, according to DeepSeek’s published results, lead to impressive performance on highly challenging math benchmarks like AIME-2024 and MATH-500, asContinue Reading

Today, Amazon SageMaker is excited to announce updates to the inference optimization toolkit, providing new functionality and enhancements to help you optimize generative AI models even faster. These updates build on the capabilities introduced in the original launch of the inference optimization toolkit (to learn more, see Achieve up toContinue Reading

Prompt engineering refers to the practice of writing instructions to get the desired responses from foundation models (FMs). You might have to spend months experimenting and iterating on your prompts, following the best practices for each model, to achieve your desired output. Furthermore, these prompts are specific to a modelContinue Reading

Today, Amazon SageMaker announced a new inference optimization toolkit that helps you reduce the time it takes to optimize generative artificial intelligence (AI) models from months to hours, to achieve best-in-class performance for your use case. With this new capability, you can choose from a menu of optimization techniques, applyContinue Reading

As generative artificial intelligence (AI) inference becomes increasingly critical for businesses, customers are seeking ways to scale their generative AI operations or integrate generative AI models into existing workflows. Model optimization has emerged as a crucial step, allowing organizations to balance cost-effectiveness and responsiveness, improving productivity. However, price-performance requirements varyContinue Reading