Text-to-image basics with Amazon Nova Canvas | Amazon Web Services

AI image generation has emerged as one of the most transformative technologies in recent years, revolutionizing how you create and interact with visual content. Amazon Nova Canvas is a generative model in the suite of Amazon Nova creative models that enables you to generate realistic and creative images from plain text descriptions.

This post serves as a beginner’s guide to using Amazon Nova Canvas. We begin with the steps to get set up on Amazon Bedrock. Amazon Bedrock is a fully managed service that hosts leading foundation models (FMs) for various use cases such as text, code, and image generation; summarization; question answering; and custom use cases that involve fine-tuning and Retrieval Augmented Generation (RAG). In this post, we focus on the Amazon Nova image generation models available in AWS Regions in the US, in particular, the Amazon Nova Canvas model. We then provide an overview of the image generation process (diffusion) and dive deep into the input parameters for text-to-image generation with Amazon Nova Canvas.

Get started with image generation on Amazon Bedrock

Complete the following steps to get set up with access to Amazon Nova Canvas and the image playground:

  1. Create an AWS account if you don’t have one already.
  2. Open the Amazon Bedrock console as an AWS Identity and Access Management (IAM) administrator or appropriate IAM user.
  3. Confirm and choose one of the Regions where the Amazon Nova Canvas model is available (for example, US East (N. Virginia)).
  4. In the navigation pane, choose Model access under Bedrock configurations.
    Expandable Bedrock configurations menu with Model access and Settings items
  5. Under What is Model access, choose Modify model access or Enable specific models (if not yet activated).
    Amazon Bedrock model access explanation with permissions, terms, and quota link for foundation models
  6. Select Nova Canvas, then choose Next.
    Amazon Bedrock model access editor displaying filtered view of Nova Canvas image model with access controls
  7. On the Review and submit page, choose Submit.
    Final review interface for Amazon Bedrock Nova Canvas model access request with edit and navigation options
  8. Refresh the Base models list. If you see the Amazon Nova Canvas model in the Access granted status, you are ready to proceed with the next steps.
    Nova Canvas model status row showing granted access
  9. In the navigation pane, choose Image / Video under Playgrounds.
    Playgrounds menu showing Chat/Text and Image/Video options
  10. Choose Select model, then choose Amazon and Nova Canvas. Then choose Apply.
    Model selection dialog with categories, available models list, and inference options for AI image generation

You are all set up to start generating images with Amazon Nova Canvas on Amazon Bedrock. The following screenshot shows an example of our playground.

Nova Canvas image generation interface showing configuration panel and two example flower vase photos
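If you prefer to check from code which models your chosen Region hosts, the following minimal sketch lists the Amazon foundation models available there. Note that this confirms availability in the Region, not that access has been granted; access is managed on the Model access console page described above. The sketch assumes your AWS credentials are already configured.

#List Amazon foundation models available in the Region (availability, not access)
import boto3

bedrock = boto3.client("bedrock", region_name="us-east-1")
models = bedrock.list_foundation_models(byProvider="Amazon")
for summary in models["modelSummaries"]:
    if "canvas" in summary["modelId"]:
        print(summary["modelId"])  #Expect amazon.nova-canvas-v1:0 if the Region hosts it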

Understanding the generation process

Amazon Nova Canvas uses diffusion-based approaches to generate images:

  • Starting point – The process begins with random noise (a pure static image).
  • Iterative denoising – The model gradually removes noise in steps, guided by your prompt. How much noise to remove at each step is learned during training. For instance, for a model to learn to generate an image of a cat, it is trained on many cat images into which noise is iteratively inserted until only noise remains. By learning how much noise was added at each step, the model effectively learns the reverse process: starting from a noisy image and iteratively subtracting noise until it arrives at an image of a cat (see the conceptual sketch after this list).
  • Text conditioning – The text prompt conditions the image generation process. The prompt is encoded as a numerical vector in a shared text-image embedding space, and this vector guides each denoising step so that the noisy image is gradually transformed into an image that captures the input prompt.
  • Image conditioning – In addition to text prompts, Amazon Nova Canvas also accepts images as inputs.
  • Safety and fairness – To comply with safety and fairness goals, both the prompt and the generated output image go through filters. If no filter is triggered, the final image is returned.
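The following toy sketch illustrates the shape of the denoising loop described above. It is purely conceptual: predict_noise is a hypothetical stand-in for the trained denoising network, and none of this reflects the actual Amazon Nova Canvas implementation.

#Conceptual sketch of diffusion-style denoising (illustration only)
import numpy as np  #Used here only to make the toy example runnable

rng = np.random.default_rng(seed=0)
image = rng.standard_normal((64, 64, 3))     #Starting point: pure noise
prompt_embedding = rng.standard_normal(768)  #Stand-in for the encoded text prompt

def predict_noise(noisy_image, step, conditioning):
    #Hypothetical stand-in for the trained network, which predicts the noise
    #present at this step, conditioned on the prompt embedding
    return 0.1 * noisy_image

for step in reversed(range(50)):  #Iterative denoising, from most to least noisy
    image = image - predict_noise(image, step, prompt_embedding)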

Prompting fundamentals

Image generation begins with effective prompting—the art of crafting text descriptions that guide the model toward your desired output. Well-constructed prompts include specific details about subject, style, lighting, perspective, mood, and composition, and work better when structured as image captions rather than a command or conversation. For example, rather than saying “generate an image of a mountain,” a more effective prompt might be “a majestic snow-capped mountain peak at sunset with dramatic lighting and wispy clouds, photorealistic style.” Refer to Amazon Nova Canvas prompting best practices for more information about prompting.
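As a small illustration, a caption-style prompt can be assembled from those elements; the component strings here are just examples, not required values.

#Assemble a caption-style prompt from its elements (illustrative values)
subject = "a majestic snow-capped mountain peak"
details = "at sunset with dramatic lighting and wispy clouds"
style = "photorealistic style"
prompt = f"{subject} {details}, {style}"
print(prompt)  #Reads as an image caption, not a command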

Let’s address the following prompt elements and observe their impact on the final output image:

  • Subject descriptions (what or who is in the image) – In the following example, we use the prompt “a cat sitting on a chair.”

Striped cat with bright eyes resting on wooden dining chair in warm lighting

  • Style references (photography, oil painting, 3D render) – In the following examples, we use the prompts “A cat sitting on a chair, oil painting style” and then “A cat sitting on a chair, anime style.”

Digital artwork of cat resting on wooden chair, painted in soft brushstrokes with warm golden tones

Stylized animation of tabby cat resting peacefully on wooden armchair, bathed in warm window light

  • Compositional elements and technical specifications (foreground, background, perspective, lighting) – In the following examples, we use the prompts “A cat sitting on a chair, mountains in the background,” and “A cat sitting on a chair, sunlight from the right low angle shot.”

Cat sitting on wooden chair with snow-capped mountains in background

Detailed portrait of alert tabby cat on wooden chair, backlit by golden afternoon sunlight

Positive and negative prompts

Positive prompts tell the model what to include. These are the elements, styles, and characteristics you want to observe in the final image. Avoid the use of negation words like “no,” “not,” or “without” in your prompt. Amazon Nova Canvas has been trained on image-caption pairs, and captions rarely describe what isn’t in an image. Therefore, the model has never learned the concept of negation. Instead, use negative prompts to specify elements to exclude from the output.

Negative prompts specify what to avoid. Common negative prompts include “blurry,” “distorted,” “low quality,” “poor anatomy,” “bad proportions,” “disfigured hands,” or “extra limbs,” which help models avoid typical generation artifacts.

In the following examples, we first use the prompt “An aerial view of an archipelago,” then we refine the prompt as “An aerial view of an archipelago. Negative Prompt: Beaches.”

Aerial view of tropical islands with turquoise waters and white beach

Aerial view of forested islands scattered across calm ocean waters

The balance between positive and negative prompting creates a defined creative space for the model to work within, often resulting in more predictable and desirable outputs.
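In the API (covered in detail later in this post), the positive and negative prompts map to separate fields. Note that the negative prompt names the excluded element directly, without negation words:

#How the archipelago example maps onto the request parameters (a sketch; the
#full invocation pattern appears in the code example later in this post)
text_to_image_params = {
    "text": "An aerial view of an archipelago",  #Positive prompt
    "negativeText": "Beaches"  #Negative prompt: elements to exclude
}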

Image dimensions and aspect ratios

Amazon Nova Canvas is trained on 1:1, portrait, and landscape resolutions, with generation tasks having a maximum output resolution of 4.19 million pixels (that is, 2048×2048 or 2816×1536). For editing tasks, the image should be at most 4,096 pixels on its longest side, have an aspect ratio between 1:4 and 4:1, and have a total pixel count of 4.19 million or smaller. Understanding these dimensional limits helps you avoid stretched or distorted results, particularly for specialized composition needs.
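The following minimal helper sanity-checks an image against the editing-task limits described above. The constants reflect the limits as stated in this post; check the service documentation for current values.

#Check an image against the editing-task limits described above
def within_editing_limits(width: int, height: int) -> bool:
    max_side = 4096           #At most 4,096 pixels on the longest side
    max_pixels = 2048 * 2048  #About 4.19 million pixels in total
    aspect = width / height
    return (max(width, height) <= max_side
            and 1 / 4 <= aspect <= 4
            and width * height <= max_pixels)

print(within_editing_limits(2048, 2048))  #True
print(within_editing_limits(4096, 2048))  #False: exceeds the total pixel cap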

Classifier-free guidance scale

The classifier-free guidance (CFG) scale controls how strictly the model follows your prompt:

  • Low values (1.1–3) – More creative freedom for the model, with potentially more aesthetic but lower-contrast and less prompt-adherent results
  • Medium values (4–7) – Balanced approach, typically recommended for most generations
  • High values (8–10) – Strict prompt adherence, which can produce more precise results but sometimes at the cost of natural aesthetics and increased color saturation

In the following examples, we use the prompt “Cherry blossoms, bonsai, Japanese style landscape, high resolution, 8k, lush greens in the background.”

The first image with CFG 2 captures some elements of cherry blossoms and bonsai. The second image with CFG 8 adheres more to the prompt with a potted bonsai, more pronounced cherry blossom flowers, and lush greens in the background.

Miniature cherry blossom tree with pink blooms cascading over moss-covered rock near peaceful pond

Cherry blossom bonsai with curved trunk and pink flowers in traditional pot against green landscape

Think of the CFG scale as adjusting how literally the model takes your instructions vs. how much artistic interpretation it applies.
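To see this effect on your own prompts, you can sweep the CFG scale while holding the seed and all other parameters fixed. The following sketch builds the request bodies; the invocation pattern is shown in the code example later in this post.

#Sweep the CFG scale while holding everything else fixed
import json

prompt = "Cherry blossoms, bonsai, Japanese style landscape, high resolution, 8k, lush greens in the background"
for cfg in [2.0, 5.0, 8.0]:
    body = json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "width": 1280,
            "height": 720,
            "cfgScale": cfg,  #Only the guidance scale varies between runs
            "seed": 12  #Fixed seed isolates the effect of the CFG scale
        }
    })
    #Pass body to client.invoke_model(...) as shown in the code example below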

Seed values and reproducibility

Every image generation begins with a randomization seed—essentially a starting number that determines initial conditions:

  • Seeds are typically represented as long integers (for example, 1234567890)
  • Using the same seed, prompt, and parameters reproduces identical images every time
  • Saving seeds allows you to revisit successful generations or create variations on promising results
  • Seed values have no inherent quality; they are simply different starting points

Reproducibility through seed values is essential for professional workflows: holding the seed constant while refining the prompt or other input parameters lets you clearly see their effect, rather than comparing completely random generations. The following images are generated using two slightly different prompts (“A portrait of a girl smiling” vs. “A portrait of a girl laughing”), while holding the seed value and all other parameters constant.
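Expressed as request bodies (the invocation pattern appears in the next section), the two generations differ only in the prompt text; because the seed and all other parameters are identical, any difference in the outputs comes from the prompt alone.

#Two requests that differ only in the prompt; the shared seed and parameters
#make the comparison reproducible
generation_config = {
    "numberOfImages": 1,
    "width": 1280,
    "height": 720,
    "cfgScale": 7.0,
    "seed": 1234567890  #Held constant across both requests
}
requests = [
    {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": text},
        "imageGenerationConfig": dict(generation_config)  #Copy to avoid aliasing
    }
    for text in ["A portrait of a girl smiling", "A portrait of a girl laughing"]
]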

All preceding images in this post have been generated using the text-to-image (TEXT_IMAGE) task type of Amazon Nova Canvas, available through the Amazon Bedrock InvokeModel API. The following is the API request and response structure for image generation:

#Request Structure
{
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {
        "text": string,         #Positive Prompt
        "negativeText": string  #Negative Prompt
    },
    "imageGenerationConfig": {
        "width": int,           #Image Resolution Width
        "height": int,          #Image Resolution Width
        "quality": "standard" | "premium",   #Image Quality
        "cfgScale": float,      #Classifer Free Guidance Scale
        "seed": int,            #Seed value
        "numberOfImages": int   #Number of images to be generated (max 5)
    }
}
#Response Structure
{
    "images": "images": string[], #list of Base64 encoded images
    "error": string
}

Code example

This solution can also be tested locally with a Python script or a Jupyter notebook. For this post, we use an Amazon SageMaker AI notebook using Python (v3.12). For more information, see Run example Amazon Bedrock API requests using an Amazon SageMaker AI notebook. For instructions to set up your SageMaker notebook instance, refer to Create an Amazon SageMaker notebook instance. Make sure the instance is set up in the same Region where Amazon Nova Canvas access is enabled. For this post, we create a Region variable to match the Region where Amazon Nova Canvas is enabled (us-east-1). You must modify this variable if you’ve enabled the model in a different Region. The following code demonstrates text-to-image generation by invoking the Amazon Nova Canvas v1.0 model using Amazon Bedrock. To understand the API request and response structure for different types of generations, parameters, and more code examples, refer to Generating images with Amazon Nova.

import base64  #For encoding/decoding base64 data
import io  #For handling byte streams
import json  #For JSON processing
import boto3  #AWS SDK for Python
from PIL import Image  #Python Imaging Library for image processing
from botocore.config import Config  #For AWS client configuration

#Create a variable to fix the region to where Nova Canvas is enabled
region = "us-east-1"

#Setup an Amazon Bedrock runtime client
client = boto3.client(service_name="bedrock-runtime", region_name=region, config=Config(read_timeout=300))

#Set the content type and accept headers for the API call
accept = "application/json"
content_type = "application/json"

#Define the prompt for image generation
prompt = """A cat sitting on a chair, mountains in the background, low angle shot."""

#Create the request body with generation parameters
api_request = json.dumps({
        "taskType": "TEXT_IMAGE",  #Specify text-to-image generation
        "textToImageParams": {
            "text": prompt  
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,   #Generate one image
            "height": 720,        #Image height in pixels
            "width": 1280,         #Image width in pixels
            "cfgScale": 7.0,       #CFG Scale
            "seed": 0              #Seed number for generation
        }
})
#Call the Bedrock model to generate the image
response = client.invoke_model(body=api_request, modelId='amazon.nova-canvas-v1:0', accept=accept, contentType=content_type)
        
#Parse the JSON response
response_json = json.loads(response.get("body").read())

#Extract the base64-encoded image from the response
base64_image = response_json.get("images")[0]
#Convert the base64 string to ASCII bytes
base64_bytes = base64_image.encode('ascii')
#Decode the base64 bytes to get the actual image bytes
image_data = base64.b64decode(base64_bytes)

#Convert bytes to an image object
output_image = Image.open(io.BytesIO(image_data))
#Display the image
output_image.show()
#Save the image to current working directory
output_image.save('output_image.png')
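If you set numberOfImages higher than 1, the response carries one Base64 string per image. The following short sketch saves them all, reusing response_json from the preceding code.

#Save every image in the response when numberOfImages is greater than 1
for i, b64_image in enumerate(response_json.get("images", [])):
    img = Image.open(io.BytesIO(base64.b64decode(b64_image)))
    img.save(f"output_image_{i}.png")  #output_image_0.png, output_image_1.png, ...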

Clean up

When you have finished testing this solution, clean up your resources to prevent AWS charges from being incurred:

  1. Back up the Jupyter notebooks in the SageMaker notebook instance.
  2. Shut down and delete the SageMaker notebook instance.

Cost considerations

Consider the following costs from the solution deployed on AWS:

  • You will incur charges for generative AI inference on Amazon Bedrock. For more details, refer to Amazon Bedrock pricing.
  • You will incur charges for your SageMaker notebook instance. For more details, refer to Amazon SageMaker pricing.

Conclusion

This post introduced you to AI image generation, and then provided an overview of accessing image models available on Amazon Bedrock. We then walked through the diffusion process and key parameters with examples using Amazon Nova Canvas. The code template and examples demonstrated in this post aim to get you familiar with the basics of Amazon Nova Canvas and get started with your AI image generation use cases on Amazon Bedrock.

For more details on text-to-image generation and other capabilities of Amazon Nova Canvas, see Generating images with Amazon Nova. Give it a try and let us know your feedback in the comments.


About the Author

Arjun Singh is a Sr. Data Scientist at Amazon, experienced in artificial intelligence, machine learning, and business intelligence. He is a visual person and deeply curious about generative AI technologies in content creation. He collaborates with customers to build ML and AI solutions to achieve their desired outcomes. He graduated with a Master’s in Information Systems from the University of Cincinnati. Outside of work, he enjoys playing tennis, working out, and learning new skills.