Yang Fei

Computer Science and Mathematics undergraduate @ HKUST

Publications

[1] Large Motion Video Autoencoding with Cross-modal Video VAE

Yazhou Xing*, Yang Fei*, Yingqing He*†, Jingye Chen, Jiaxin Xie, Xiaowei Chi, Qifeng Chen†
arXiv preprint, 2024

Learning a robust video Variational Autoencoder (VAE) is critical for efficient video generation and compression. This paper introduces a novel video autoencoder that achieves high-fidelity video encoding by combining temporal-aware spatial compression, lightweight motion compression, and textual guidance from text-to-video datasets. The model also supports joint training on images and videos, enhancing versatility and reconstruction quality.

Paper | Code


* Joint first authors
† Corresponding authors