Text Guided Image Manipulation Using GAN Inversion

Views: 2

Recent GAN models are capable of generating very highquality images. Then, a very important followup problem is, how to control these generated images. A careful analysis of the latent space of GANs suggests that this control can be achieved by manipulating the latent codes in a desired direction. In this talk, we will be presenting our model that is capable of modifying images in such a way that they have some desired attributes corresponding to any text description. For this purpose, we use the idea of GAN inversion. Our model makes use of two encoders to invert the images along with their textual descriptions to the latent space of a pretrained StyleGAN model. Additionally, we utilize OpenAI s Contrastive LanguageImage Pretraining (CLIP) model to enforce the latent codes to be aligned with the desired textual descriptions. The inverted latent codes are fed to the StyleGAN generator to obtain the manipulated images. We conducted experiments on face datasets and compared our results with the related