How it works
The agent understands three kinds of context:
- Your message: what you type in the chat. Be as descriptive or brief as you want.
- Canvas context: the images and videos currently on your canvas. The agent can see and reference these.
- Workspace memory: your saved preferences, brand guidelines, and creative patterns from past projects.
What the agent can do
The agent has access to a full suite of creative tools:

| Category | Capabilities |
|---|---|
| Image generation | Create images from text descriptions, edit existing images, remove backgrounds, upscale resolution |
| Video generation | Create videos from text, animate between keyframes (first and last frame), generate reference video with visual consistency |
| Media analysis | Analyze images and videos to extract style, composition, color, and motion details |
Talking to the agent
You can be conversational or specific:
- “Create a sunset over a calm ocean”: the agent picks reasonable defaults
- “Generate a 16:9 image of the main character standing in the rain, matching the color palette from the reference photo”: more specific, and references canvas images
- “Animate this image with a slow zoom out”: references an existing canvas image
- “What’s the style of this uploaded photo?”: asks the vision engine to analyze an image
Batch and parallel generation
The agent can handle multiple generation tasks at once. When you request several outputs in a single message, the agent runs them in parallel, so you don’t have to wait for one to finish before the next one starts.
Quantity requests:
- “Generate 5 different poster concepts for this campaign”
- “Create 3 color variations of this logo”
- “Try this scene in warm tones, cool tones, and high contrast”
- “Generate both a realistic and an illustrated version”
- “Create front, side, and back views of this character” — The agent generates all three in parallel using the same reference
- “Remove the background from these 3 images”
- “Upscale all the character portraits on the canvas”
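
The parallel behavior described above can be pictured as a simple fan-out. This is a minimal sketch only, not the product's implementation: `generate` and `run_batch` are hypothetical names, and the short sleep stands in for real model latency.

```python
import asyncio


async def generate(prompt: str) -> str:
    """Hypothetical stand-in for one generation call (not a real product API)."""
    await asyncio.sleep(0.01)  # simulate model latency
    return f"result for: {prompt}"


async def run_batch(prompts: list[str]) -> list[str]:
    # gather() schedules every task at once and returns results in input order,
    # so no task waits for an earlier one to finish before it starts.
    return await asyncio.gather(*(generate(p) for p in prompts))


results = asyncio.run(run_batch(["warm tones", "cool tones", "high contrast"]))
```

Because the tasks overlap, the batch takes roughly as long as its slowest task rather than the sum of all of them.
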
Using skills
Skills are reusable prompt templates that you invoke by typing / in the chat input. Instead of retyping the same style direction or complex instruction every time, save it as a skill and trigger it with a short command like /cinematic or /product-shot.
When you type /, a popup shows your available skills. Select one to insert it as a visual chip in your message. You can combine multiple skills with your own text in a single message.
Skills are workspace-scoped, so your entire team shares the same set of /slash commands. See Skills for how to create and manage them.
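
One way to think about a skill is as saved text that is substituted for its slash command. The sketch below illustrates that idea under stated assumptions: the `SKILLS` contents, the `expand_skills` helper, and the substitution rule are all hypothetical, and the product may handle skills differently (for example, by keeping them as separate chips rather than inline text).

```python
import re

# Hypothetical workspace skill library: command name -> saved prompt text.
SKILLS = {
    "cinematic": "anamorphic lens, shallow depth of field, film grain",
    "product-shot": "studio lighting, white seamless background",
}


def expand_skills(message: str, skills: dict[str, str]) -> str:
    """Replace each known /command with its saved text; leave unknown ones as typed."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return skills.get(name, match.group(0))

    # Only match a slash at the start of a word, so "16/9" is left alone.
    return re.sub(r"(?<!\S)/([a-z0-9-]+)", substitute, message)


expanded = expand_skills("/cinematic a lighthouse at dusk", SKILLS)
```

Here `expanded` holds the saved cinematic direction followed by the user's own text, which mirrors combining a skill with free-form input in one message.
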
Conversation memory
The agent remembers everything discussed within a project’s chat session. This includes:
- Your creative direction and preferences expressed during the conversation
- What you liked and didn’t like about previous generations
- Character names, scene descriptions, and terminology you’ve established
- Corrections and refinements you’ve made
How generations appear
Every generation creates a new item on the canvas. Nothing is overwritten or replaced, so you always keep your previous results. This means you can:
- Compare multiple variations side by side
- Go back to an earlier generation at any time
- Mix and match results from different attempts
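
The "nothing is overwritten" rule amounts to an append-only collection. As a rough illustration, assuming a hypothetical `Canvas` class that is not part of the product:

```python
from dataclasses import dataclass, field


@dataclass
class Canvas:
    """Hypothetical append-only canvas: items are added, never replaced."""

    items: list[str] = field(default_factory=list)

    def add_generation(self, result: str) -> int:
        # Each result gets its own slot; earlier results stay untouched.
        self.items.append(result)
        return len(self.items) - 1


canvas = Canvas()
first = canvas.add_generation("poster v1")
second = canvas.add_generation("poster v2")
```

After both calls, `canvas.items` still contains the first poster alongside the second, which is what makes side-by-side comparison possible.
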
Credits
Each generation uses credits from your plan. Different operations cost different amounts: image generation is relatively inexpensive, while video generation uses more credits. See the pricing page for a full breakdown.
If a generation fails (due to a model error or timeout), credits are automatically refunded to your account.
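
The charge-then-refund behavior can be sketched as follows. Everything here is an assumption for illustration: the `COSTS` values are made up (actual prices are on the pricing page), and `charge_for_generation` is a hypothetical helper, not the product's billing code.

```python
from typing import Callable

# Made-up credit costs for illustration only; see the pricing page for real values.
COSTS = {"image": 1, "video": 10}


def charge_for_generation(
    balance: int, operation: str, generate: Callable[[], object]
) -> int:
    """Deduct the cost up front and refund it if the generation fails."""
    cost = COSTS[operation]
    if balance < cost:
        raise ValueError("not enough credits")
    balance -= cost
    try:
        generate()
    except (RuntimeError, TimeoutError):
        # Model error or timeout: return the credits to the account.
        balance += cost
    return balance


def failing_generation() -> None:
    raise TimeoutError("model timed out")


after_failure = charge_for_generation(20, "video", failing_generation)
after_success = charge_for_generation(20, "video", lambda: "clip")
```

In this sketch a failed video generation leaves the balance unchanged, while a successful one reduces it by the video cost.
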

