Effortlessly generate videos from text and images with Wan 2.1
Wan 2.1 is a comprehensive and open suite of video foundation models that pushes the boundaries of video generation with state-of-the-art performance and consumer-grade GPU compatibility.
Consistently outperforms both open-source and commercial models across various benchmarks, including VBench scores for dynamic degree and multi-object interactions. The 1.3B model generates a 5-second 480P video in about 4 minutes on an RTX 4090.
The T2V-1.3B model requires only 8.19 GB VRAM, making it accessible to a wide range of users. Multiple variants are available, including T2V-1.3B and T2V-14B models, each optimized for different hardware configurations.
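As a rough illustration of the 8.19 GB figure, the sketch below checks whether a given amount of GPU memory meets it. The helper name and the sample capacities are hypothetical and not part of Wan 2.1; only the 8.19 GB threshold comes from the text above.

```python
# Stated VRAM requirement for the T2V-1.3B model (taken from the text above).
T2V_1_3B_MIN_VRAM_GB = 8.19


def fits_t2v_1_3b(available_vram_gb: float) -> bool:
    """Return True if the GPU's memory meets the stated 8.19 GB requirement.

    Hypothetical helper for illustration only; it is not part of Wan 2.1.
    """
    return available_vram_gb >= T2V_1_3B_MIN_VRAM_GB


if __name__ == "__main__":
    # Sample capacities are illustrative, not a compatibility list.
    for vram_gb in (6, 8, 12, 24):
        status = "meets" if fits_t2v_1_3b(vram_gb) else "is below"
        print(f"{vram_gb} GB {status} the stated requirement")
```

A simple threshold like this only reflects the published number; actual memory use may differ with resolution and clip length.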
Supports Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio. Features multilingual support for both Chinese and English text and offers over 100 customizable artistic styles for unique visual content.
Wan 2.1 has revolutionized my video production workflow. Its text-to-video capabilities outperform existing open-source models and even some commercial solutions.
Michael Chen
Video Content Creator
As a digital creator, Wan 2.1 has become my go-to tool for turning static images into dynamic videos. The powerful Wan-VAE delivers exceptional efficiency and performance.
Sarah Johnson
Digital Artist
Wan 2.1's versatility in creating videos from both text and images has transformed our marketing campaigns. The ability to run on consumer-grade GPUs makes it accessible for our entire team.
David Wilson
Marketing Director
Using Wan 2.1 for our social media content has been a game-changer. The ability to generate high-quality videos on standard hardware makes content creation effortless.
Emily Zhang
Social Media Manager
The video quality from Wan 2.1 is exceptional. Being able to generate 480P videos in about 4 minutes on my RTX 4090 has significantly improved my content production process.
James Anderson
YouTuber
Wan 2.1's visual text generation capabilities have opened up endless possibilities for our video content. It's truly revolutionary to have a model that can generate both Chinese and English text.
Lisa Wang
Creative Director
Find out all the essential details about our AI model and how it can serve your needs.
Wan 2.1 consistently outperforms existing open-source models and state-of-the-art commercial solutions across multiple benchmarks. It supports consumer-grade GPUs, with the T2V-1.3B model requiring only 8.19 GB VRAM, making it accessible to most users.
You can access Wan 2.1 through our platform or download it from GitHub. It's available for both single-GPU and multi-GPU setups, with options for text-to-video and image-to-video generation.
Wan 2.1 excels in multiple tasks including Text-to-Video, Image-to-Video, Video Editing, Text-to-Image, and Video-to-Audio. It's also the first video model capable of generating both Chinese and English text.
Wan 2.1 offers two main models: a 1.3B model that requires only 8.19 GB VRAM and can generate a 5-second 480P video on an RTX 4090 in about 4 minutes, and a larger 14B model for even higher quality results.
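To put the RTX 4090 figure in perspective, here is a minimal sketch that extrapolates generation time for other clip lengths. It assumes runtime scales linearly with clip length, which the text does not state; the function name is hypothetical.

```python
# Figures stated above: a 5-second 480P clip takes about 4 minutes on an RTX 4090.
REFERENCE_CLIP_SECONDS = 5
REFERENCE_RUNTIME_MINUTES = 4


def estimate_runtime_minutes(clip_seconds: float) -> float:
    """Estimate generation time for the 1.3B model at 480P on an RTX 4090.

    ASSUMPTION: runtime scales linearly with clip length. This is a
    back-of-the-envelope extrapolation, not a measured result.
    """
    if clip_seconds < 0:
        raise ValueError("clip_seconds must be non-negative")
    return clip_seconds * REFERENCE_RUNTIME_MINUTES / REFERENCE_CLIP_SECONDS


if __name__ == "__main__":
    for seconds in (5, 10):
        print(f"{seconds}s clip: ~{estimate_runtime_minutes(seconds):.0f} min")
```

Under this assumption, each second of 480P video costs roughly 48 seconds of compute on that card.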
Yes, the T2V-1.3B model requires only 8.19 GB VRAM, making it compatible with almost all consumer-grade GPUs. It can generate high-quality videos without requiring specialized hardware.
Yes, Wan 2.1 is the first video model capable of generating both Chinese and English text within its videos. This robust visual text generation broadens its practical applications across both languages.