HeyGen released HyperFrames as an open-source framework for turning HTML, CSS, and seekable animations into deterministic MP4 videos. The pitch is short and pointed: “Write HTML. Render video. Built for agents.”

That last clause is the real news. HyperFrames is not just another video renderer. It is a deliberate bet that the next generation of video production will be authored by AI coding agents, not by developers dragging timelines or writing React components, and that those agents will find plain HTML far easier to produce than JSX.

The framework ships with agent skills for Claude Code, Cursor, Gemini CLI, and Codex. A single command, npx skills add heygen-com/hyperframes, teaches the agent a production loop: plan the video, write valid HTML, wire seekable animations, add media, lint, preview, and render. The agent handles the entire pipeline, from natural-language prompt to MP4 file.

HyperFrames is licensed under Apache 2.0, with no per-render fees and no commercial-use thresholds. The renderer uses headless Chrome and FFmpeg, seeking to each frame and encoding the result. The same input always produces the same output, which makes it suitable for CI pipelines and regression testing.

The Remotion comparison is the story

HyperFrames is explicit about its competitor. The project’s README includes a direct comparison table with Remotion, the dominant open-source video renderer that uses React components as its authoring model. The differences are stark.

Remotion requires a bundler; HyperFrames does not, and an index.html composition plays as-is in a browser. Remotion compositions are authored as JSX inside a React project, while HyperFrames uses plain HTML with data attributes for timing and tracks. In Remotion, wall-clock animation patterns need care to stay frame-accurate. HyperFrames relies on seekable, frame-accurate animation through adapters for GSAP, CSS animations, Lottie, Three.js, Anime.js, and WAAPI.

The implication is clear. HeyGen believes that the future of video creation belongs to HTML, not to React. That is a strong claim in a developer ecosystem where React has become the default choice for complex UI work. But it makes sense when you consider the agent angle.

Coding agents are trained on vast amounts of web content, so they tend to handle HTML, CSS, and JavaScript more reliably than any framework-specific syntax. A model that can write a landing page can write a HyperFrames composition with minimal additional training. A model that needs to produce a React component must navigate JSX, component lifecycle, and bundler configuration.

HyperFrames reduces the cognitive load on the agent to the same level as writing a web page. That is the bet.

What the framework actually does

A HyperFrames composition is an HTML file with data-* attributes that define timing. A div with data-start="0" and data-duration="6" plays for six seconds. A video element with data-track-index="0" becomes the first video track. An audio element with data-volume="0.5" becomes a background music track. Animation timelines are registered on window.__timelines and seeked frame by frame.
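To make the timing model concrete, here is a minimal Python sketch that reads those data attributes out of a composition and computes when each element starts and ends. It is not the HyperFrames parser. The markup, the file names, the parse_clips and total_duration helpers, the presence of data-start and data-duration on media elements, and the default volume of 1.0 are all assumptions made for illustration; only the attribute names come from the description above.

```python
from html.parser import HTMLParser

# Hypothetical composition; only the data-* attribute names come from the article.
COMPOSITION = """
<div id="title" data-start="0" data-duration="6">Hello</div>
<video src="clip.mp4" data-start="6" data-duration="4" data-track-index="0"></video>
<audio src="music.mp3" data-start="0" data-duration="10" data-volume="0.5"></audio>
"""


class TimingParser(HTMLParser):
    """Collects every element that declares both a start and a duration."""

    def __init__(self):
        super().__init__()
        self.clips = []

    def handle_starttag(self, tag, attrs):
        data = {name: value for name, value in attrs if name.startswith("data-")}
        if "data-start" not in data or "data-duration" not in data:
            return  # element carries no timing information
        start = float(data["data-start"])
        self.clips.append({
            "tag": tag,
            "start": start,
            "end": start + float(data["data-duration"]),
            # Assumed default: full volume when data-volume is absent.
            "volume": float(data.get("data-volume", 1.0)),
        })


def parse_clips(html):
    parser = TimingParser()
    parser.feed(html)
    parser.close()
    return parser.clips


def total_duration(clips):
    """The composition lasts until its last clip ends."""
    return max((clip["end"] for clip in clips), default=0.0)


if __name__ == "__main__":
    for clip in parse_clips(COMPOSITION):
        print(clip)
```

The point of the sketch is that timing lives in the markup itself, so any tool that can read HTML can reason about the video's structure without executing it.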

The renderer parses the composition, drives headless Chrome to each frame position, captures the rendered output, and encodes the frames into an MP4 using FFmpeg. Audio tracks are mixed in separately. The result is deterministic: the same input always produces the same video, frame for frame.
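The reason seeking gives determinism can be shown with a small Python sketch. It is a toy model, not the HyperFrames engine: the 30 fps frame rate, the one-second fade, and the frame_times, opacity_at, and render helpers are hypothetical. The idea it illustrates is that each frame is computed from an explicit timestamp rather than from the wall clock, so repeating the render repeats the result.

```python
def frame_times(duration_s, fps):
    """Timestamps (in seconds) that a frame-by-frame renderer would seek to."""
    frame_count = round(duration_s * fps)
    return [index / fps for index in range(frame_count)]


def opacity_at(t, fade_s=1.0):
    """A seekable animation: the value depends only on the requested time t."""
    return min(max(t / fade_s, 0.0), 1.0)


def render(duration_s, fps):
    """Stand-in for capture: evaluate the animation at every frame timestamp."""
    return [round(opacity_at(t), 4) for t in frame_times(duration_s, fps)]


if __name__ == "__main__":
    first = render(2, 30)
    second = render(2, 30)
    print(len(first), first == second)
```

An animation that reads the current clock instead of an explicit t would return different values depending on how long each capture took, which is the frame-accuracy problem seekable adapters are meant to avoid.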

The CLI handles scaffolding, preview with live reload, linting, inspection, and rendering. The @hyperframes/core package provides types, parsers, and generators. The @hyperframes/engine handles the capture pipeline. The @hyperframes/producer manages the full render pipeline including audio mixing. There is also an AWS Lambda render path for distributed rendering at scale.

The Catalog provides reusable blocks and components: transitions, overlays, captions, charts, maps, and effects. Install them with npx hyperframes add flash-through-white for a shader transition or npx hyperframes add data-chart for an animated chart.

The frame.md concept

HyperFrames introduces frame.md, a design system format inverted for the camera. Every brand has a design.md with tokens, colors, and typography rules. None of them were written for a video frame. frame.md takes those same tokens and rewrites them so an AI agent can compose a promo video without guessing at scale or reaching for web chrome elements.

The output is a DESIGN.md superset. Atoms stay sacred. Composition stays free. Numbers come from the script. HeyGen ships several pre-built frame.md themes: Biennale Yellow, BlockFrame, Blue Professional, Bold Poster, Broadside, Capsule, Cartesian, Cobalt Grid, Coral, and Creative Mode.

This is a smart move. It lowers the barrier for brand-consistent video production by non-designers. And it plays directly into the agent workflow: an agent can read a brand’s frame.md, understand the visual constraints, and produce a video that matches the brand without human intervention.

Why this matters for AI builders

HyperFrames is production-proven: HeyGen uses it internally, and external adopters include tldraw, TanStack, and others listed in the ADOPTERS.md file. That gives the project credibility beyond a speculative open-source release.

For AI builders, HyperFrames solves a specific problem: how to turn agent-generated content into video without human handoff. Agents can already write text, generate images, and produce code. Video has remained difficult because the tooling was designed for human editors, not automated pipelines. HyperFrames treats video as another output format for an HTML-generating agent, no different from a web page or an email template.

The deterministic rendering is critical for production use. If an agent generates a video for a customer-facing use case, the output must be consistent. HyperFrames guarantees that the same input produces the same frames. That makes it suitable for A/B testing, regression testing, and automated content pipelines where quality checks run without human review.
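One way a pipeline could use that guarantee is a regression check that compares a fresh render against a stored reference. The Python sketch below is an assumption-laden illustration, not a HyperFrames feature: file_digest and renders_match are hypothetical helpers, and the check assumes the encoder writes no timestamps or other varying metadata into the MP4. If it does, comparing decoded frames would be the stricter test.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large videos do not fill memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def renders_match(reference_path, candidate_path):
    """True when two render outputs are byte-for-byte identical."""
    return file_digest(reference_path) == file_digest(candidate_path)
```

In a CI job, a mismatch would fail the build and flag the composition for review, in the same way a snapshot test does for a web page.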

The open-source license removes the pricing uncertainty that surrounds many AI video tools. Builders can deploy HyperFrames on their own infrastructure, scale renders via AWS Lambda, and pay only for compute. No per-video licensing fees, no API rate limits, no vendor lock-in.

The open question

HyperFrames makes a strong case for HTML-native video authoring. But Remotion has a head start of several years, a mature cloud renderer, and a large community of developers who know React. The question is whether the agent workflow advantage is enough to overcome that inertia.

For human developers who already know React, Remotion remains the natural choice. For agents that write HTML by default, HyperFrames is the path of least resistance. As agent-driven content production grows, that advantage may become decisive.

The project is young. The documentation is thorough. The comparison table is honest. The Apache 2.0 license removes friction. The agent skills work today with Claude Code, Cursor, Gemini CLI, and Codex.

HeyGen has placed a bet on HTML as the universal intermediate representation for video. If agents become the primary authors of video content, that bet may look prescient.