[For Patreon Supporters] Wan 2.2 SoundToVideo - I2V & V2V Workflow (Ver. 20250828)
Added 2025-08-28 13:36:32 +0000 UTC
Related Post : https://www.patreon.com/posts/wan-2-2-sound-to-137551273
Tutorial Video : https://youtu.be/MegoM8KSO_s
Attached: the 2 workflows demoed in this tutorial video.
Resources:
wan2.2 s2v in Comfy
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files
Models You Need To Get This Working
-----------------------------------
models/diffusion_models
-----------------------------------
wan2.2_s2v_14B_bf16 (For High VRAM):
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/diffusion_models/wan2.2_s2v_14B_bf16.safetensors
WanVideo_comfy_fp8_scaled/S2V (For Low VRAM):
https://huggingface.co/Kijai/WanVideo_comfy_fp8_scaled/tree/main/S2V
Wan2.2-S2V-14B-GGUF (For Low VRAM):
https://huggingface.co/QuantStack/Wan2.2-S2V-14B-GGUF
models/audio_encoders
-----------------------------------
wav2vec2_large_english_fp16 :
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/audio_encoders/wav2vec2_large_english_fp16.safetensors
models/loras
-----------------------------------
wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors :
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors
models/text_encoders
-----------------------------------
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/text_encoders
models/vae
-----------------------------------
https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/tree/main/split_files/vae
Custom nodes (ComfyUI-Logic) : https://github.com/benjiyaya/ComfyUI-Logic
Optional, for audio separation : https://huggingface.co/Kijai/MelBandRoFormer_comfy/tree/main
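If you want a quick way to check that the files above landed in the right folders, here is a minimal sketch. It is not part of the attached workflows. It only covers the three files that have a direct file link in the list (the text encoder and VAE links point to folders, so pick those manually). The helper names `direct_url` and `missing_files` are made up for this example, and `models/loras` is assumed to be the LoRA folder of a default ComfyUI install. If you use the fp8 or GGUF model for low VRAM, swap the first entry for the file you chose.

```python
from pathlib import Path

# Files from the list above, keyed by their folder under the ComfyUI root.
# Assumption: a default ComfyUI layout, where the LoRA folder is "models/loras".
HF = "https://huggingface.co/Comfy-Org/Wan_2.2_ComfyUI_Repackaged/blob/main/split_files"
REQUIRED = {
    "models/diffusion_models": [
        # High-VRAM option; replace with your fp8 or GGUF file if you use one.
        f"{HF}/diffusion_models/wan2.2_s2v_14B_bf16.safetensors",
    ],
    "models/audio_encoders": [
        f"{HF}/audio_encoders/wav2vec2_large_english_fp16.safetensors",
    ],
    "models/loras": [
        f"{HF}/loras/wan2.2_i2v_lightx2v_4steps_lora_v1_low_noise.safetensors",
    ],
}


def direct_url(blob_url: str) -> str:
    """Turn a Hugging Face 'blob' page link into a direct 'resolve' download link."""
    return blob_url.replace("/blob/", "/resolve/", 1)


def missing_files(comfy_root):
    """Return (expected_path, download_url) for every listed file not found on disk."""
    root = Path(comfy_root)
    missing = []
    for folder, urls in REQUIRED.items():
        for url in urls:
            filename = url.rsplit("/", 1)[-1]
            target = root / folder / filename
            if not target.is_file():
                missing.append((target, direct_url(url)))
    return missing


if __name__ == "__main__":
    import sys

    # Pass your ComfyUI folder as the first argument; defaults to ./ComfyUI.
    root = sys.argv[1] if len(sys.argv) > 1 else "ComfyUI"
    for target, url in missing_files(root):
        print(f"missing: {target}")
        print(f"  download: {url}")
```

Run it with the path to your ComfyUI folder as the only argument. It prints nothing when all three files are in place.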
It is hard to explain everything in one video, so feel free to leave your questions in the comment section or discuss them in the Patreon Discord.


Comments
This question is too broad. A lot of factors can cause this.
Benjamin Law
2025-09-12 17:44:23 +0000 UTC
When I set it to 720p, the video generation freezes when it gets to the KSampler. Any clue?
Nicolas Giarrusso
2025-09-12 16:07:26 +0000 UTC
Installation-wise, yes: it uses native nodes, so everything is already packed. Hardware-wise, no: Wan 2.2 requires a lot more VRAM for computing. Some YouTubers falsely claim that 8GB of VRAM is enough to run Wan 2.2. Really? Generating a little 3-5 second clip took him half an hour. InfiniteTalk requires less VRAM. For lip-sync quality at 480p, it is 50/50 between the two. At 720p, Wan 2.2 gives somewhat more detail, but not much.
Benjamin
2025-08-28 14:18:49 +0000 UTC
I hate to ask, but overall is Wan 2.2 S2V even worth it compared to InfiniteTalk? I ask because not even you sounded too impressed with it in the video. Are there any advantages to using Wan S2V over InfiniteTalk?
Russ Ader
2025-08-28 14:03:41 +0000 UTC