Scaling Video Creation with the Synthesia API: Best Practices for Dynamic Media and Text


As personalized video becomes more important in marketing, support, and internal comms, producing it efficiently at scale matters more and more. With Synthesia’s API, you can automate the creation of hundreds (or thousands) of videos with dynamic elements, such as personalized text or variable images and videos, all within your existing workflows.

In this post, we’ll explore best practices when using images, videos, and text as variables through the Synthesia API. Whether you’re generating localized product explainers or personalized outreach videos, these tips will help you keep quality and consistency high at scale.


Handling Dynamic Media: Images and Videos as Variables

When inserting an image or video dynamically into a template, choosing how it fits within the canvas is critical. The Synthesia platform offers three fitting modes:

1. Crop

Crops the media to fill the element completely. Some parts may be cut off.

➡️ Use when framing is consistent and cropping is acceptable.

2. Contain

Ensures the entire image or video is visible, adding padding if needed.

➡️ Best for demos, screenshots, or content where every pixel matters.

3. Cover

Fills the element, cropping edges as needed to maintain aspect ratio.

➡️ Ideal for full-screen imagery or abstract backgrounds.

Media fitting mode in Synthesia

Choosing the right fit mode

  • Product demos: use contain to avoid cropping out critical details.
  • Mood-setting imagery: cover works well here, even for videos. Cropping is acceptable because the content is illustrative rather than informative.
  • General visuals: crop offers a clean, centered look, but use it with caution because parts of the media may be cut off.
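
To see why contain can leave gaps while cover cuts off edges, here is a minimal Python sketch of the geometry. It assumes contain and cover behave like the usual aspect-ratio-preserving fit modes (scaling by the smaller or the larger of the two axis ratios). `fit_media` and `FitResult` are hypothetical names for illustration, not part of the Synthesia API, and crop is left out because its exact behavior isn't specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitResult:
    scaled_width: float
    scaled_height: float
    pad_x: float   # total empty space left + right inside the element
    pad_y: float   # total empty space top + bottom inside the element
    crop_x: float  # total width that overflows the element and is cut off
    crop_y: float  # total height that overflows the element and is cut off


def fit_media(media_w, media_h, box_w, box_h, mode):
    """Scale media into an element while preserving its aspect ratio.

    Assumption: "contain" scales until the whole media fits inside the
    element, "cover" scales until the element is completely filled.
    """
    if min(media_w, media_h, box_w, box_h) <= 0:
        raise ValueError("all dimensions must be positive")

    scale_x = box_w / media_w
    scale_y = box_h / media_h
    if mode == "contain":
        scale = min(scale_x, scale_y)  # limited by the tighter axis
    elif mode == "cover":
        scale = max(scale_x, scale_y)  # driven by the looser axis
    else:
        raise ValueError(f"unsupported mode: {mode!r}")

    w, h = media_w * scale, media_h * scale
    return FitResult(
        scaled_width=w,
        scaled_height=h,
        pad_x=max(box_w - w, 0.0),
        pad_y=max(box_h - h, 0.0),
        crop_x=max(w - box_w, 0.0),
        crop_y=max(h - box_h, 0.0),
    )
```

For a square 1080×1080 image placed in a 1920×1080 element, contain leaves about 840 px of horizontal padding in total, while cover scales the image to roughly 1920×1920 and cuts off about 840 px vertically. Running a check like this over your incoming media is one way to spot assets that will pad or crop heavily before you render.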

Design Tip: Enhance Layout with Background Layers

When using contain, the original image or video might not fully fill the designated area. To avoid awkward gaps:

  1. Place a plain shape or brand-colored background behind the media.
  2. Overlay the image or video on top.
  3. Optionally, use an abstract looping video to add visual interest.

This setup helps maintain a consistent visual layout, even when media comes in with varying aspect ratios.

Background video for empty spaces

Handling Text as a Variable

Dynamic text introduces unique layout challenges — especially when the input length varies widely.

The Common Pitfall: Spacing Issues

Stacking multiple text elements (e.g., name + title + company) can lead to:

  • Overlapping text
  • Inconsistent gaps
  • Unexpected layout breaks

Best Practice: Use a Single Text Element

Combine dynamic text into a single text element. This ensures:

  • Consistent line spacing
  • No vertical alignment issues
  • Easier styling and layout control

For example:

Key Values

{{sc3_value_1}}

{{sc3_value_2}}

{{sc3_value_3}}

A single text element holding all text variables keeps the spacing consistent
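
On the API side, filling that single element comes down to sending one value per variable. The sketch below builds the variable mapping for the three placeholders shown above. The helper names are hypothetical, and the `templateId` and `templateData` field names are assumptions about the request body, so check them against the Synthesia API documentation before relying on them.

```python
def build_template_data(values, prefix="sc3_value_", slots=3):
    """Map up to `slots` strings onto sc3_value_1 .. sc3_value_N.

    Unused slots are filled with empty strings so every placeholder in
    the template receives a value (a design choice of this sketch;
    verify that empty strings are accepted by the API).
    """
    if len(values) > slots:
        raise ValueError(f"expected at most {slots} values, got {len(values)}")
    padded = list(values) + [""] * (slots - len(values))
    return {f"{prefix}{i}": text.strip() for i, text in enumerate(padded, start=1)}


def build_request_body(template_id, values):
    """Assemble a request body for one video.

    The "templateId" and "templateData" keys are assumed field names.
    """
    return {
        "templateId": template_id,
        "templateData": build_template_data(values),
    }


# Example: one body per recipient, ready to be sent by your HTTP client.
recipients = [
    ["Ada Lovelace", "Chief Engineer", "Analytical Engines Ltd"],
    ["Grace Hopper", "Rear Admiral"],
]
bodies = [build_request_body("your-template-id", row) for row in recipients]
```

Because every row goes through the same builder, the variable names stay in sync with the template even when some recipients have fewer fields than others.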

Text Element Types

Synthesia provides two main types of elements for dynamic text:

  • Text block: Ideal for tight vertical alignment. Expands from the bottom.
  • Label block: Useful when you want the element to stay centered as it grows. Expands from the top and bottom.

Label element for text input in Synthesia

Controlling Input Length via API

To keep your layout safe:

  • Enforce character limits in your script or LLM prompt.
  • Validate inputs before video generation.
  • Use sentence truncation or summarization if needed.
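
The checks above can be sketched in a few lines of Python that run before any video is requested. The function names and the per-variable limits are hypothetical; pick limits that match what your own template can hold.

```python
import re


def truncate_at_sentence(text, limit):
    """Cut text to at most `limit` characters, preferring a sentence end."""
    if limit < 4:
        raise ValueError("limit must be at least 4")
    text = " ".join(text.split())  # collapse newlines and repeated spaces
    if len(text) <= limit:
        return text

    # Positions just after sentence-ending punctuation that fit the limit.
    ends = [
        m.end()
        for m in re.finditer(r"[.!?](?=\s|$)", text)
        if m.end() <= limit
    ]
    if ends:
        return text[: ends[-1]]

    # No full sentence fits: drop the last (possibly partial) word and
    # mark the cut with "...", reserving three characters for it.
    cut = text[: limit - 3].rsplit(" ", 1)[0]
    return cut.rstrip(",;: ") + "..."


def find_overlong(template_data, limits):
    """Return {variable: length} for every value longer than its limit."""
    return {
        key: len(value)
        for key, value in template_data.items()
        if len(value) > limits.get(key, float("inf"))
    }


# Example: flag problems first, then shorten only the offending values.
limits = {"sc3_value_1": 40, "sc3_value_2": 30}
data = {
    "sc3_value_1": "Welcome to Acme. We build rockets for everyone on Earth.",
    "sc3_value_2": "Chief Engineer",
}
for key in find_overlong(data, limits):
    data[key] = truncate_at_sentence(data[key], limits[key])
```

Truncating at a sentence boundary keeps the on-screen text readable. If a value still doesn't fit your layout after this step, rejecting it or sending it back for summarization is usually safer than rendering it.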

Brand Kit Tip for Labels

If you’re using a label element with a background (even transparent), avoid using a brand kit color. This ensures that updates to the brand kit won’t unexpectedly affect your label styling in future renders.


Bonus: Using Media to Fill Sparse Text Areas

When the dynamic text might be short (e.g., just a name), you can visually enrich the scene:

  • Overlay a muted abstract video behind the text.
  • Use a gradient or lightly animated shape as filler.

This ensures that even text-light scenes look intentional and visually engaging.


Conclusion

Using Synthesia’s API to create videos at scale unlocks powerful personalization opportunities — but it’s also a design challenge. By mastering how images, videos, and text behave as dynamic variables, you can ensure your automated videos look polished and consistent.

To recap:

  • Use contain for detailed images, cover for aesthetics, and crop for uniform framing.
  • Consolidate dynamic text into one element.
  • Choose the right text element type (text block or label block) and control input length via API.
  • Layer media smartly to maintain visual harmony.

Want to dive deeper? Check out the Synthesia API documentation or come to our community for more tips and tricks.
