Stars
3 stars written in Jupyter Notebook
A latent text-to-image diffusion model
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image