TiViBench: Benchmarking Think-in-Video Reasoning for Video Generative Models

Feb 1, 2026

UniCalli: A Unified Diffusion Framework for Column-Level Generation and Recognition of Chinese Calligraphy

Jan 1, 2026

T-Rex-Omni: Integrating Negative Visual Prompt in Generic Object Detection

Jan 1, 2026

Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion

Jan 1, 2025

Uni-IR: One Stage is Enough for Ambiguity-Reduced Inverse Rendering

Jan 1, 2025

TransPixeler: Advancing Text-to-Video Generation with Transparency

Jan 1, 2025

SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity

Jan 1, 2025

StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams

Jan 1, 2025

SEED-Story: Multimodal Long Story Generation with Large Language Model

Jan 1, 2025

Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification

Jan 1, 2025

Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion

Jan 1, 2025

RhythmGaussian: Repurposing Generalizable Gaussian Model For Remote Physiological Measurement

Jan 1, 2025

PRM: Photometric Stereo based Large Reconstruction Model

Jan 1, 2025

PreGenie: An Agentic Framework for High-quality Visual Presentation Generation

Jan 1, 2025

POSTA: A Go-to Framework for Customized Artistic Poster Generation

Jan 1, 2025

PARM: Multi-Objective Test-Time Alignment via Preference-Aware Autoregressive Reward Model

Jan 1, 2025

Orchestrating Audio: Multi-Agent Framework for Long-Video Audio Synthesis

Jan 1, 2025

Motion inversion for video customization

Jan 1, 2025

Lotus: Diffusion-based visual foundation model for high-quality dense prediction

Jan 1, 2025

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation

Jan 1, 2025

GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs

Jan 1, 2025

FlexGen: Flexible Multi-View Generation from Text and Image Inputs

Jan 1, 2025

Event-Guided Consistent Video Enhancement with Modality-Adaptive Diffusion Pipeline

Jan 1, 2025

DrivingRecon: Large 4D Gaussian Reconstruction Model For Autonomous Driving

Jan 1, 2025

DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation

Jan 1, 2025

ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback

Jan 1, 2025

Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection

Jan 1, 2025

Text-Anchored Score Composition: Tackling Condition Misalignment in Text-to-Image Diffusion Models

Sep 29, 2024

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders

Sep 29, 2024

Defect Spectrum: A Granular Look of Large-scale Defect Datasets with Rich Semantics

Sep 29, 2024

Bi-TTA: Bidirectional Test-Time Adapter for Remote Physiological Measurement

Sep 29, 2024

An Incremental Unified Framework for Small Defect Inspection

Sep 29, 2024

Bridging Data Gaps in Diffusion Models with Adversarial Noise-Based Transfer Learning

Jun 21, 2024

LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching

Jun 17, 2024

Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs

Jun 17, 2024

Learning to Remove Wrinkled Transparent Film with Polarized Prior

Jun 17, 2024

GNeRP: Gaussian-guided Neural Reconstruction of Reflective Objects with Noisy Polarization Priors

May 7, 2024

Denoising Diffusion Step-aware Models

May 7, 2024

Backdoor Contrastive Learning via Bi-level Trigger Optimization

May 7, 2024

Adv3D: Generating 3D Adversarial Examples for 3D Object Detection in Driving Scenarios with NeRF

Jan 1, 2024

Rethinking Rendering in Generalizable Neural Surface Reconstruction: A Learning-based Solution

Nov 10, 2023

Lift3D: Synthesize 3D Training Data by Lifting 2D GAN to 3D Generative Radiance Field

Jun 18, 2023

Neuron Structure Modeling for Generalizable Remote Physiological Measurement

Jun 13, 2023

Semi-supervised Monocular 3D Object Detection by Multi-view Consistency

Oct 23, 2022

RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering

Oct 23, 2022

DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation

Oct 23, 2022

Representation Compensation Networks for Continual Semantic Segmentation

Jun 1, 2022

SC-GAN: Image Synthesis via Semantic Composition

Jul 1, 2021

Learning to Know Where to See: A Visibility-Aware Approach for Occluded Person Re-identification

Jul 1, 2021

Delving into Deep Imbalanced Regression

Jun 1, 2021

VCNet: A Robust Approach to Blind Image Inpainting

Jan 1, 2020

Domain Adaptive Image-to-image Translation

Jan 1, 2020

Attentive normalization for conditional image generation

Jan 1, 2020

View Independent Generative Adversarial Network for Novel View Synthesis

Jan 1, 2019

Semantic component decomposition for face attribute manipulation

Jan 1, 2019

Homomorphic latent space interpolation for unpaired image-to-image translation

Jan 1, 2019

Facelet-bank for fast portrait manipulation

Jan 1, 2018

Makeup-go: Blind reversion of portrait edit

Jan 1, 2017

An enhanced deep feature representation for person re-identification

Jan 1, 2016

Mirror representation for modeling view-specific transform in person re-identification

Jan 1, 2015