A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization

1POSTECH, 2KRAFTON Inc.
arXiv preprint

With our novel neural re-parameterized optimization of 3D face meshes, we present accurate and spatio-temporally consistent 3D face mesh pseudo-labels for large-scale 2D face video datasets.

Abstract

We propose NeuFace, a method for pseudo-annotating videos with 3D face meshes via neural re-parameterized optimization. Despite substantial progress in 3D face reconstruction, generating reliable 3D face labels for in-the-wild dynamic videos remains challenging. Using NeuFace optimization, we annotate large-scale face videos with face meshes that are accurate for each view and frame and consistent across them; we call the result the NeuFace-dataset. We investigate, via gradient analysis, how neural re-parameterization helps reconstruct image-aligned facial details on 3D meshes. By exploiting the naturalness and diversity of the 3D faces in our dataset, we demonstrate its usefulness for two 3D face-related tasks: improving the reconstruction accuracy of an existing 3D face reconstruction model and learning a 3D facial motion prior.

NeuFace optimization

NeuFace optimization re-parameterizes the 3D face mesh parameters as over-parameterized neural network parameters. Allen-Zhu et al. (2019) have shown that optimization with neural over-parameterization can reach a globally optimal solution with high probability. NeuFace optimization proceeds in an Expectation-Maximization fashion, supervised by a 2D landmark loss, a multi-view bootstrapping loss, and a temporal consistency loss.
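The re-parameterization idea can be illustrated with a deliberately small sketch. Everything below is an assumption made for illustration, not the NeuFace implementation: the "network" is a single linear layer `W`, the projection `A` is a random linear map standing in for "mesh parameters to 2D landmarks", the data are synthetic, and only a squared landmark term is used (the multi-view and temporal terms and the Expectation-Maximization alternation are omitted). The point is only the mechanics: the per-frame parameters are never updated directly; they are always produced by the network, and gradients flow to the network weights through the chain rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes: T frames, D mesh parameters per frame,
# F-dimensional per-frame features, K landmark coordinates per frame.
T, D, F, K = 8, 4, 64, 20
features = rng.normal(size=(T, F))         # stand-in for per-frame network input
A = rng.normal(size=(K, D))                # stand-in for "parameters -> 2D landmarks"
landmarks = rng.normal(size=(T, D)) @ A.T  # synthetic "detected" landmarks, (T, K)


def landmark_loss_and_grad(params):
    """Squared landmark error and its gradient w.r.t. per-frame parameters."""
    residual = params @ A.T - landmarks                  # (T, K)
    loss = 0.5 * np.mean(np.sum(residual ** 2, axis=1))
    grad = residual @ A / T                              # (T, D)
    return loss, grad


def fit_reparameterized(steps=1000):
    """Gradient descent on network weights W; parameters are features @ W."""
    W = np.zeros((F, D))
    # Step size 1/L, where L bounds the curvature of the loss as a function of W.
    lr = T / (np.linalg.norm(features, 2) ** 2 * np.linalg.norm(A, 2) ** 2)
    for _ in range(steps):
        params = features @ W                            # network output
        _, grad_params = landmark_loss_and_grad(params)
        W -= lr * (features.T @ grad_params)             # chain rule: dL/dW
    return features @ W


initial_loss, _ = landmark_loss_and_grad(np.zeros((T, D)))
final_loss, _ = landmark_loss_and_grad(fit_reparameterized())
print(f"landmark loss: {initial_loss:.4f} -> {final_loss:.6f}")
```

Because `F` is larger than `T`, the toy network has more weights than there are per-frame parameters to fit, which loosely mirrors the over-parameterized setting described above. The sketch does not attempt to reproduce any claim about global optimality.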

3DMM fitting vs. NeuFace optimization

Thanks to the neural re-parameterization of the 3D face meshes, NeuFace optimization obtains 3D faces with more image-aligned facial details by avoiding mean shape bias (Joo et al., 2020).

DECA vs. NeuFace optimization

Compared to a conventional 3D face mesh reconstruction method, DECA (Feng et al., 2021), NeuFace optimization obtains multi-view consistent and more stable facial motion, which demonstrates the reliability of the NeuFace-dataset's 3DMM annotations for large-scale videos.
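To make the notion of temporally stable motion concrete, the following sketch smooths a sequence of per-frame parameters with a moving average and compares a simple jitter measure before and after. Both helpers (`moving_average`, `temporal_jitter`), the window size, and the synthetic trajectory are assumptions chosen for illustration; they are not the temporal consistency loss or the evaluation metric used for NeuFace.

```python
import numpy as np


def moving_average(params, window=3):
    """Smooth per-frame parameters of shape (T, D) over time, edge-padded."""
    pad = window // 2
    padded = np.pad(params, ((pad, pad), (0, 0)), mode="edge")
    return np.stack([padded[t:t + window].mean(axis=0)
                     for t in range(params.shape[0])])


def temporal_jitter(params):
    """Mean absolute frame-to-frame change; lower means steadier motion."""
    return float(np.abs(np.diff(params, axis=0)).mean())


# Hypothetical per-frame predictions: a smooth trajectory plus per-frame noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)[:, None]
smooth = np.hstack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)])
noisy = smooth + 0.1 * rng.normal(size=smooth.shape)
smoothed = moving_average(noisy, window=5)

print("jitter before:", temporal_jitter(noisy))
print("jitter after: ", temporal_jitter(smoothed))
```

A smoothed sequence like `smoothed` could serve as a per-frame target that predictions are pulled toward, which is one plausible way to encourage temporal consistency.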

Application: Generative 3D facial motion prior

As a practical application of the NeuFace-dataset, we train a generative 3D facial motion prior. The existing facial motion capture dataset VOCASET is too limited to learn the complex dynamic manifold of human faces (top row), whereas the scale, naturalness, and diversity of the NeuFace-dataset are key to learning such a high-quality facial motion prior (bottom row).

BibTeX

@article{Youwang2023NeuFace,
  author    = {Kim Youwang and Lee Hyun and Kim Sung-Bin and Suekyeong Nam and Janghoon Ju and Tae-Hyun Oh},
  title     = {A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization},
  journal   = {arXiv preprint, arXiv:2310.03205},
  year      = {2023},
}