MagCache: Fast Video Generation with Magnitude-Aware Cache

Abstract

Existing acceleration techniques for video diffusion models often rely on uniform heuristics or time-embedding variants to skip timesteps and reuse cached features. These approaches typically require extensive calibration with curated prompts and risk inconsistent outputs due to prompt-specific overfitting. In this paper, we introduce a novel and robust discovery: a unified magnitude law observed across different models and prompts. Specifically, the magnitude ratio of successive residual outputs decreases monotonically: steadily over most timesteps and rapidly in the last several steps. Leveraging this insight, we introduce a Magnitude-aware Cache (MagCache) that adaptively skips unimportant timesteps using an error modeling mechanism and an adaptive caching strategy. Unlike existing methods, which require dozens of curated samples for calibration, MagCache requires only a single sample. Experimental results show that MagCache achieves 2.68× and 2.82× speedups on Wan 2.1 and HunyuanVideo, respectively, while preserving superior visual fidelity. It significantly outperforms existing methods in LPIPS, SSIM, and PSNR under comparable computational budgets.

Magnitude-Aware Cache (MagCache)

Motivation

Existing methods for accelerating video diffusion models via timestep skipping generally fall into two categories: uniform heuristic strategies and adaptive approaches based on prompt-specific calibration. Uniform strategies lack accuracy because they treat all timesteps equally, ignoring the dynamic nature of residual changes during the denoising process. In contrast, adaptive methods like TeaCache attempt to model residual differences using polynomial fitting, but they require extensive calibration on dozens of curated prompts, which introduces a risk of overfitting and limits generalization. In this work, we uncover a simple yet robust magnitude decay law that governs the similarity of residual outputs across timesteps: the magnitude ratio between adjacent residuals decreases steadily in early steps and more sharply in later ones. Additionally, both the standard deviation of this magnitude ratio and the token-wise cosine distance remain close to zero throughout most early steps. This suggests that residual differences between adjacent steps are closely tied to the magnitude ratio. These patterns are consistent across various prompts and model variants, making magnitude a reliable and robust indicator of residual difference. Leveraging these insights, we introduce Magnitude-aware Cache (MagCache), a simple yet effective approach for accelerating video generation.

Figure 2: Relationships between residuals across diffusion timesteps. Magnitude ratio serves as both an accurate and stable criterion for measuring the difference between residuals.
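The three quantities discussed above (magnitude ratio, its standard deviation, and token-wise cosine distance) can be illustrated with a minimal sketch. It assumes that a residual is stored as a list of token vectors and that the magnitude ratio is the per-token L2-norm ratio averaged over tokens; the function names and the eps guard are hypothetical and are not taken from the MagCache code.

```python
import math


def token_norm(token):
    """L2 norm of a single token vector (a list of floats)."""
    return math.sqrt(sum(x * x for x in token))


def residual_stats(prev_residual, curr_residual, eps=1e-12):
    """Compare two successive residuals token by token.

    Returns (mean magnitude ratio, std of the ratio, mean cosine distance).
    Assumption: both residuals hold the same number of equally sized tokens.
    """
    ratios, cos_dists = [], []
    for prev_tok, curr_tok in zip(prev_residual, curr_residual):
        n_prev, n_curr = token_norm(prev_tok), token_norm(curr_tok)
        # Magnitude ratio of the current residual to the previous one.
        ratios.append(n_curr / (n_prev + eps))
        # Cosine distance = 1 - cosine similarity.
        dot = sum(a * b for a, b in zip(prev_tok, curr_tok))
        cos_dists.append(1.0 - dot / (n_prev * n_curr + eps))
    mean_ratio = sum(ratios) / len(ratios)
    std_ratio = math.sqrt(sum((r - mean_ratio) ** 2 for r in ratios) / len(ratios))
    mean_cos_dist = sum(cos_dists) / len(cos_dists)
    return mean_ratio, std_ratio, mean_cos_dist


# Toy example: every token shrinks by 10% without changing direction.
prev = [[1.0, 0.0], [0.0, 2.0]]
curr = [[0.9, 0.0], [0.0, 1.8]]
mean_ratio, std_ratio, mean_cos_dist = residual_stats(prev, curr)
```

In this toy case the mean ratio is about 0.9 while the standard deviation and the cosine distance are about zero, which is the regime in which the magnitude ratio alone describes how far two successive residuals are apart.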

Implementation

MagCache uses a single random prompt to calibrate the importance of timesteps. Based on the resulting magnitude curve, it dynamically skips unimportant steps through accurate error modeling and an adaptive caching strategy. Compared to existing methods, our approach significantly accelerates video generation and achieves superior visual quality, while eliminating the need for costly prompt engineering and calibration.

Figure 3: MagCache adaptively caches the important intermediate output residuals during inference.
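One way to picture the skipping decision is the sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes the calibration run yields one magnitude ratio per timestep, that the error of reusing a cached residual is modeled by how far the accumulated ratio drifts from 1, and that a skip is allowed only while the accumulated error and the number of consecutive skips stay within limits. The names plan_skips, err_threshold, max_consecutive_skips, and warmup_steps, and their default values, are hypothetical.

```python
def plan_skips(mag_ratios, err_threshold=0.05, max_consecutive_skips=2, warmup_steps=1):
    """Return one boolean per timestep: True means "reuse the cached residual".

    mag_ratios[t] is the calibrated magnitude ratio of the residual at step t
    to the residual at step t - 1 (the entry for step 0 is a placeholder).
    """
    skip_plan = []
    acc_ratio, acc_err, run_length = 1.0, 0.0, 0
    for step, ratio in enumerate(mag_ratios):
        can_skip = False
        if step >= warmup_steps:
            # Accumulate the drift relative to the last fully computed residual.
            acc_ratio *= ratio
            acc_err += abs(1.0 - acc_ratio)
            run_length += 1
            can_skip = acc_err <= err_threshold and run_length <= max_consecutive_skips
        if not can_skip:
            # A full forward pass refreshes the cache, so the error model resets.
            acc_ratio, acc_err, run_length = 1.0, 0.0, 0
        skip_plan.append(can_skip)
    return skip_plan


# Toy calibration curve: slow decay first, then a sharp drop at the end.
plan = plan_skips([1.0, 0.99, 0.98, 0.97, 0.80])
```

With these toy values the plan is [False, True, True, False, False]: the two slowly decaying steps reuse the cache, and the sharp drop at the end forces a full computation. On a skipped step, a sampler would add the cached residual to the current latent instead of running the model.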

Evaluations

Quantitative Results

Visual Results

HunyuanVideo T2V (54min05s)

TeaCache (23min49s)
PSNR: 22.80, 2.3x speedup

MagCache (19min33s)
PSNR: 26.76, 2.8x speedup

Prompt: The video shows two astronauts in bulky suits walking slowly on the moon’s surface, against a vast starry universe. Their steps are heavy and slow, kicking up dust in the low-gravity environment. The scene is silent, mysterious, and evokes the courage and dreams of space exploration.

HunyuanVideo T2V, 5s, 720P.
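The speedups in these comparisons appear to be the baseline latency divided by the accelerated latency. That reading is an assumption, but it is consistent with the reported timings. A small sketch that recomputes the HunyuanVideo figures above follows; the helper names are hypothetical.

```python
import re


def to_seconds(label):
    """Parse a latency label such as '54min05s', '189s', or '44.56s'."""
    match = re.fullmatch(r"(?:(\d+)min)?(\d+(?:\.\d+)?)s", label)
    if match is None:
        raise ValueError(f"unrecognized latency label: {label!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def speedup(baseline, accelerated):
    """Baseline latency divided by accelerated latency."""
    return to_seconds(baseline) / to_seconds(accelerated)


teacache_speedup = round(speedup("54min05s", "23min49s"), 1)  # 2.3
magcache_speedup = round(speedup("54min05s", "19min33s"), 1)  # 2.8
```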

Wan2.1 14B I2V (30min40s)

TeaCache (13min04s)
PSNR: 13.67, 2.3x speedup

MagCache (10min03s)
PSNR: 23.67, 3.0x speedup

Prompt: A woman in black lace stands confidently in a dim Art Deco interior with polished marble floors. Stark chiaroscuro lighting highlights her sharp features as she tilts her head, crimson lips parting in a knowing smile. Her smoldering gaze meets the viewer while she turns gracefully, lace casting shifting shadows on the walls. A medium shot with a subtle dolly zoom, framed by velvet drapes, adds depth. The mysterious, refined atmosphere blends modern elegance with vintage Hollywood glamour, rendered in 8K hyper-realistic detail, metallic gold accents glowing in the soft light.

Wan2.1 I2V, 3s, 720P.

Wan2.1 14B T2V (60min04s)

TeaCache (30min01s)
PSNR: 17.39, 2.0x speedup

MagCache (21min40s)
PSNR: 24.39, 2.8x speedup

Prompt: The video shows two astronauts in bulky suits walking slowly on the moon’s surface, against a vast starry universe. Their steps are heavy and slow, kicking up dust in the low-gravity environment. The scene is silent, mysterious, and evokes the courage and dreams of space exploration.

Wan2.1 14B T2V, 5s, 720P.

Wan2.1 14B T2V (60min04s)

TeaCache (30min01s)
PSNR: 14.32, 2.0x speedup

MagCache (21min40s)
PSNR: 21.82, 2.8x speedup

Prompt: A stylish woman walks down a Tokyo street filled with warm glowing neon and animated city signage. She wears a black leather jacket, a long red dress, and black boots, and carries a black purse. She wears sunglasses and red lipstick. She walks confidently and casually. The street is damp and reflective, creating a mirror effect of the colorful lights. Many pedestrians walk about.

Wan2.1 14B T2V, 5s, 720P.

Wan2.1 1.3B T2V (189s)

TeaCache (95s)
PSNR: 14.86, 2.0x speedup

MagCache (87s)
PSNR: 20.51, 2.2x speedup

MagCache (68s)
PSNR: 18.93, 2.8x speedup

Prompt: Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.

Wan2.1 1.3B T2V, 5s, 480P.

OpenSora T2V (44.56s)

TeaCache (21.67s)
PSNR: 20.51, 2.1x speedup

MagCache (16.86s)
PSNR: 26.82, 2.6x speedup

Prompt: A tranquil tableau of an ornate Victorian streetlamp standing on a cobblestone street corner, illuminating the empty night

OpenSora T2V.

FramePack

TeaCache 1.92x speedup

MagCache 2.07x speedup

MagCache 2.25x speedup

Prompt: The girl dances gracefully, with clear movements, full of charm.

FramePack, 5s, 540P.

FLUX-dev (14.26s)

TeaCache (5.65s), 2.5x

MagCache (5.05s), 2.8x

Prompt: A photo of a black bicycle.

FLUX-dev

BibTeX

@misc{ma2025magcachefastvideogeneration,
      title={MagCache: Fast Video Generation with Magnitude-Aware Cache}, 
      author={Zehong Ma and Longhui Wei and Feng Wang and Shiliang Zhang and Qi Tian},
      year={2025},
      eprint={2506.09045},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.09045}, 
}