Photo-Realistic Out-of-domain GAN inversion via Invertibility Decomposition


The fidelity of Generative Adversarial Network (GAN) inversion is impeded by Out-Of-Domain (OOD) areas (e.g., background, accessories) in the image. Detecting the OOD areas that lie beyond the generation ability of the pre-trained model and blending these regions with the input image can enhance fidelity. The "invertibility mask" identifies these OOD areas, and existing methods predict the mask from the reconstruction error. However, the estimated mask is usually inaccurate because reconstruction error is also present in the In-Domain (ID) area. In this paper, we propose a novel framework that enhances the fidelity of human face inversion by designing a new module that decomposes the input images into ID and OOD partitions with invertibility masks. Unlike previous works, our invertibility detector is learned simultaneously with a spatial alignment module. We iteratively align the generated features to the input geometry and reduce the reconstruction error in the ID regions. Thus, the OOD areas become more distinguishable and can be precisely predicted. Then, we improve the fidelity of our results by blending the OOD areas from the input image with the ID GAN inversion results. Our method produces photo-realistic results for real-world human face image inversion and manipulation. Extensive experiments demonstrate our method's superiority over existing methods in the quality of GAN inversion and attribute manipulation.
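
The mask-and-blend idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function names `error_based_mask` and `blend`, the threshold value, and the convention that a mask value of 1 marks an OOD pixel are all assumptions made for illustration. The mask here is the simple error-thresholding baseline attributed to existing methods, not the learned invertibility detector proposed in the paper.

```python
import numpy as np

def error_based_mask(image, recon, threshold=0.2):
    """Baseline heuristic (assumed form): mark a pixel as OOD when its
    mean absolute reconstruction error over channels exceeds `threshold`."""
    err = np.abs(image - recon).mean(axis=-1, keepdims=True)  # shape (H, W, 1)
    return (err > threshold).astype(image.dtype)

def blend(image, recon, ood_mask):
    """Keep input pixels where ood_mask == 1 and inverted pixels elsewhere."""
    return ood_mask * image + (1.0 - ood_mask) * recon

# Toy 2x2 RGB example with values in [0, 1].
image = np.zeros((2, 2, 3), dtype=np.float32)
image[0, 0] = 1.0   # stands in for an accessory the generator cannot reproduce
recon = np.zeros((2, 2, 3), dtype=np.float32)
recon[1, 1] = 0.1   # small residual error inside an in-domain region

mask = error_based_mask(image, recon)
result = blend(image, recon, mask)
```

In this toy case the large error at pixel (0, 0) is flagged as OOD and copied from the input, while the small in-domain error at (1, 1) stays below the threshold and the inverted pixel is kept. With a lower threshold the in-domain residual would be flagged as well, which is the inaccuracy the abstract attributes to error-based masks.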

In Proceedings of the IEEE International Conference on Computer Vision (ICCV)
Ying-Cong Chen
Assistant Professor

Ying-Cong Chen is an Assistant Professor at the AI Thrust, Information Hub of the Hong Kong University of Science and Technology (Guangzhou Campus). He obtained his Ph.D. degree from the Chinese University of Hong Kong. His research lies in the broad area of computer vision and machine learning, aiming to empower machines with the capacity to understand human appearance, physiology, and psychology. His work contributes to a wide range of applications, including contactless health monitoring, semantic photo synthesis, and intelligent video surveillance.