LongCat Avatar 1.5 Multi
Animate two people in one image from separate audio tracks with per-speaker lip sync and turn-taking (up to 64s).
Overview
LongCat Avatar 1.5 Multi is a vidutils model available on the GenVR platform. It animates two people in a single image, driving each from its own audio track, with per-speaker lip sync and turn-taking. A job can run up to 64s.
Pricing
Billed through GenVR credits
20 credits per 5s at 480p; 40 credits per 5s at 720p. Meanwhile mode bills the longer of the two audio tracks, max(left, right). Sequential mode bills the combined length of both tracks, left + right (min 5s, max 64s).
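The pricing rule above can be sketched as a small estimator. This is an illustration only, not GenVR's billing code: the function names are hypothetical, it assumes partial 5s blocks are rounded up (the page does not say how they are handled), and it assumes the 5s minimum and 64s maximum apply to the billed duration in every mode.

```python
import math

# Rates quoted on this page: credits per 5-second block.
CREDITS_PER_BLOCK = {"480p": 20, "720p": 40}
BLOCK_SECONDS = 5
MIN_SECONDS = 5
MAX_SECONDS = 64


def billed_seconds(left_s: float, right_s: float, mode: str) -> float:
    """Duration used for billing, clamped to the 5s-64s range."""
    if mode == "meanwhile":
        # Both speakers talk at once: the longer track sets the length.
        raw = max(left_s, right_s)
    elif mode in ("left_right", "right_left"):
        # Speakers talk one after the other: the lengths add up.
        raw = left_s + right_s
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # ASSUMPTION: the min/max limits apply to the billed duration in all modes.
    return min(max(raw, MIN_SECONDS), MAX_SECONDS)


def estimate_credits(left_s: float, right_s: float, mode: str,
                     resolution: str = "480p") -> int:
    """Rough credit estimate for one job."""
    # ASSUMPTION: a partial 5s block is billed as a full block.
    blocks = math.ceil(billed_seconds(left_s, right_s, mode) / BLOCK_SECONDS)
    return blocks * CREDITS_PER_BLOCK[resolution]
```

Under these assumptions, a 10s left track and a 20s right track would come to 4 blocks in meanwhile mode and 6 blocks in either sequential mode, so the same pair of clips costs more when the speakers take turns.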
Properties
Customizable parameters available for this model.
Required
Single image with two people (left and right). Clear faces work best.
Audio track for the person on the left (trimmed to 64s max per job)
Audio track for the person on the right (trimmed to 64s max per job)
Speaking order. left_right / right_left: sequential, the two people speak one after the other in the named order. meanwhile: both speak at the same time (billed by the longer track).
Optional
Guide expression, pose, or visual style for both speakers
Output resolution: 480p or 720p
Random seed for reproducibility (-1 for random)
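To make the three speaking-order modes listed under Required concrete, the sketch below computes when each person would be speaking. It is a mental model rather than the platform's implementation: the function name is hypothetical, it assumes the second speaker starts exactly when the first finishes, and it does not model how tracks are trimmed to the 64s job limit.

```python
def speaking_windows(left_s: float, right_s: float, mode: str) -> dict:
    """Return {"left": (start, end), "right": (start, end)} in seconds."""
    if mode == "meanwhile":
        # Both tracks start together at t = 0.
        return {"left": (0.0, left_s), "right": (0.0, right_s)}
    if mode == "left_right":
        # ASSUMPTION: no gap between the two turns.
        return {"left": (0.0, left_s), "right": (left_s, left_s + right_s)}
    if mode == "right_left":
        return {"right": (0.0, right_s), "left": (right_s, right_s + left_s)}
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, with a 10s left track and a 20s right track, left_right would place the right speaker's turn from 10s to 30s, while meanwhile would have both speaking from 0s.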
GenVR Visual App
Experience the power of LongCat Avatar 1.5 Multi through our intuitive visual interface. Experiment with prompts, adjust parameters in real-time, and download your results instantly.
Developer API Docs
Integrate this model into your own applications. Access enterprise-grade performance, scalable infrastructure, and detailed documentation for rapid deployment.