ComfyUI wrapper nodes for HunyuanVideo
Scaled dot product attention (sdpa) should now be working (only tested on Windows with torch 2.5.1+cu124 on a 4090). sageattention is still recommended for speed, but it is no longer required, which makes installation much easier.
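For context on how such a fallback typically looks (a minimal sketch, not the wrapper's actual internals; the function below is illustrative), the idea is to use sageattention when it is installed and otherwise fall back to torch's built-in sdpa:

```python
import torch.nn.functional as F

try:
    from sageattention import sageattn  # optional speedup on supported GPUs
    SAGE_AVAILABLE = True
except ImportError:
    SAGE_AVAILABLE = False

def attention(q, k, v):
    # q, k, v: (batch, heads, seq_len, head_dim) tensors
    if SAGE_AVAILABLE:
        return sageattn(q, k, v)
    # Fallback: PyTorch 2.x built-in scaled dot product attention,
    # no extra dependencies needed.
    return F.scaled_dot_product_attention(q, k, v)
```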
Vid2vid test (source video): chrome_O4wUtaOQhJ.mp4

text2vid (old test): chrome_SLgFRaGXGV.mp4
Transformer and VAE (single files, no autodownload):
https://huggingface.co/Kijai/HunyuanVideo_comfy/tree/main
Files go to the usual ComfyUI folders (diffusion_models and vae)
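If you prefer scripting the downloads, a minimal huggingface_hub sketch (the filenames here are examples; check the repo listing for the current files):

```python
from huggingface_hub import hf_hub_download

# Example filenames -- verify against the repo before running.
hf_hub_download(
    repo_id="Kijai/HunyuanVideo_comfy",
    filename="hunyuan_video_720_cfgdistill_bf16.safetensors",
    local_dir="ComfyUI/models/diffusion_models",
)
hf_hub_download(
    repo_id="Kijai/HunyuanVideo_comfy",
    filename="hunyuan_video_vae_bf16.safetensors",
    local_dir="ComfyUI/models/vae",
)
```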
LLM text encoder (has autodownload):
https://huggingface.co/Kijai/llava-llama-3-8b-text-encoder-tokenizer
Files go to ComfyUI/models/LLM/llava-llama-3-8b-text-encoder-tokenizer
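To fetch it manually instead of relying on the autodownload, a snapshot_download sketch that mirrors the expected folder layout:

```python
from huggingface_hub import snapshot_download

# Pulls the whole encoder/tokenizer repo into ComfyUI's LLM folder.
snapshot_download(
    repo_id="Kijai/llava-llama-3-8b-text-encoder-tokenizer",
    local_dir="ComfyUI/models/LLM/llava-llama-3-8b-text-encoder-tokenizer",
)
```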
Clip text encoder (has autodownload):
Either use any Clip_L model supported by ComfyUI by disabling the clip_model in the text encoder loader and plugging a ClipLoader into the text encoder node, or let the autodownloader fetch the original CLIP model from:
https://huggingface.co/openai/clip-vit-large-patch14 (of the weight files you only need the .safetensors one, plus all the config files) to:
ComfyUI/models/clip/clip-vit-large-patch14
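A manual-download sketch that keeps only the safetensors weights and config files, skipping the repo's duplicate weight formats (it assumes the safetensors file is named model.safetensors; adjust if the repo differs):

```python
from huggingface_hub import snapshot_download

# Only the safetensors weights plus config/tokenizer files are needed.
snapshot_download(
    repo_id="openai/clip-vit-large-patch14",
    allow_patterns=["*.json", "*.txt", "model.safetensors"],
    local_dir="ComfyUI/models/clip/clip-vit-large-patch14",
)
```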
Memory use is entirely dependent on resolution and frame count; don't expect to be able to go very high even on 24GB.
The good news is that the model can produce functional videos even at really low resolutions.
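As a rough back-of-the-envelope sketch of how resolution and frame count trade off (assuming 8x spatial / 4x temporal VAE compression and 16 latent channels, which is my reading of HunyuanVideo's setup; the latents themselves are small, but transformer activations scale with the same token count):

```python
def latent_elements(width, height, frames, channels=16):
    # Assumes 8x spatial and 4x temporal VAE compression.
    t = (frames - 1) // 4 + 1  # temporally compressed frame count
    return channels * t * (height // 8) * (width // 8)

# Compare a higher-res and a lower-res run at the same frame count.
for w, h, f in [(848, 480, 49), (544, 320, 49)]:
    n = latent_elements(w, h, f)
    print(f"{w}x{h}, {f} frames: {n:,} latent elements "
          f"(~{n * 2 / 1024**2:.1f} MB in fp16)")
```

Halving the resolution roughly quarters the latent size (and the per-layer activation memory along with it), which is why low-resolution runs remain feasible on smaller cards.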