Compact 3D Gaussian Representation for Radiance Field

CVPR 2024 (Highlight)

Joo Chan Lee¹, Daniel Rho², Xiangyu Sun¹, Jong Hwan Ko¹, and Eunbyung Park¹
¹Sungkyunkwan University, ²KT

[Code] [Paper]




Architecture Overview

3D Gaussian splatting (3DGS) has recently emerged as an alternative representation that leverages a 3D Gaussian-based representation and adopts the rasterization pipeline to render images rather than volumetric rendering, achieving very fast rendering speed and promising image quality. However, a significant drawback arises: 3DGS entails a substantial number of 3D Gaussians to maintain the high fidelity of the rendered images, which requires a large amount of memory and storage. To address this critical issue, we place a specific emphasis on two key objectives: reducing the number of Gaussian points without sacrificing performance and compressing the Gaussian attributes, such as view-dependent color and covariance. To this end, we propose a learnable mask strategy that significantly reduces the number of Gaussians while preserving high performance. In addition, we propose a compact but effective representation of view-dependent color by employing a grid-based neural field rather than relying on spherical harmonics. Finally, we learn codebooks to compactly represent the geometric attributes of the Gaussians via vector quantization. Combined with model compression techniques such as quantization and entropy coding, we consistently show over 25x reduced storage and enhanced rendering speed, while maintaining the quality of the scene representation, compared to 3DGS. Our work provides a comprehensive framework for 3D scene representation, achieving high performance, fast training, compactness, and real-time rendering.
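To make the color representation concrete, below is a minimal PyTorch sketch of a grid-based view-dependent color field: a feature grid queried at each Gaussian's center, decoded together with the view direction by a tiny MLP. The actual method uses a compact multi-resolution hash grid; this sketch substitutes a single dense grid, and all names and sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch: grid-based view-dependent color for Gaussians.
# A dense feature grid stands in for the hash grid used in the paper.
class GridColorField(nn.Module):
    def __init__(self, res=32, feat_dim=8, hidden=64):
        super().__init__()
        self.grid = nn.Parameter(torch.zeros(1, feat_dim, res, res, res))
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),   # RGB in [0, 1]
        )

    def forward(self, xyz, view_dir):
        # xyz: (N, 3) Gaussian centers normalized to [-1, 1]; view_dir: (N, 3)
        coords = xyz.view(1, -1, 1, 1, 3)
        feats = F.grid_sample(self.grid, coords, align_corners=True)  # (1, C, N, 1, 1)
        feats = feats.view(self.grid.shape[1], -1).t()                # (N, C)
        return self.mlp(torch.cat([feats, view_dir], dim=1))          # (N, 3)

field = GridColorField()
rgb = field(torch.rand(4096, 3) * 2 - 1,
            F.normalize(torch.randn(4096, 3), dim=1))
```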


The effect of masking

Masking can significantly reduce the number of Gaussians while retaining high quality.
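A minimal sketch of how such a learnable per-Gaussian mask can be trained end-to-end with a straight-through estimator follows; the threshold, loss weighting, and variable names are illustrative rather than the paper's exact formulation.

```python
import torch

num_gaussians = 10_000
mask_param = torch.zeros(num_gaussians, requires_grad=True)  # one mask logit per Gaussian

def binary_mask(m, threshold=0.01):
    soft = torch.sigmoid(m)            # differentiable surrogate in (0, 1)
    hard = (soft > threshold).float()  # hard 0/1 mask used at render time
    # Straight-through estimator: the forward pass sees `hard`,
    # while gradients flow through `soft`.
    return hard + soft - soft.detach()

mask = binary_mask(mask_param)

# The mask scales opacity (and scale), so masked Gaussians drop out of rendering.
opacity = torch.rand(num_gaussians)
masked_opacity = mask * opacity

# A sparsity penalty on the soft mask encourages pruning Gaussians.
loss_mask = torch.sigmoid(mask_param).mean()
```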


The detailed process of R-VQ

In the first stage, each scale or rotation vector is compared against the codes in the first codebook, and the closest code is selected as the stage's result. In the next stage, the residual between the original vector and the first stage's result is compared against a second codebook. This process is repeated up to the final stage.
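The following is a minimal PyTorch sketch of this residual quantization loop. The vector dimension, codebook size, and stage count are illustrative, and the codebooks are random here; in practice they are learned.

```python
import torch

def rvq_encode(x, codebooks):
    """x: (N, D) vectors; codebooks: list of (K, D) tensors, one per stage.
    Returns per-stage code indices and the quantized reconstruction."""
    residual = x
    indices, quantized = [], torch.zeros_like(x)
    for codebook in codebooks:
        # Find the code closest to the current residual in this stage's codebook.
        dists = torch.cdist(residual, codebook)   # (N, K) pairwise distances
        idx = dists.argmin(dim=1)                 # (N,) nearest-code index
        selected = codebook[idx]                  # (N, D)
        quantized = quantized + selected
        residual = residual - selected            # the next stage quantizes what's left
        indices.append(idx)
    return indices, quantized

N, D, K, stages = 1024, 4, 64, 6                  # e.g. 4-D rotation quaternions
x = torch.randn(N, D)
codebooks = [torch.randn(K, D) for _ in range(stages)]
indices, x_hat = rvq_encode(x, codebooks)
# Storage per vector: `stages` small code indices instead of D full-precision floats.
```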

Overall, R-VQ reduces storage requirements by approximately 30% while maintaining reconstruction quality, training time, and rendering speed.


Results

In addition to the proposed method (Ours), we applied straightforward post-processing techniques to the model attributes, a variant we denote as Ours+PP. These post-processing steps (the first two are sketched in code after this list) are:

1) Applying 8-bit min-max quantization to opacity and hash grid parameters.
2) Pruning hash grid parameters with values below 0.1.
3) Applying Huffman encoding to the quantized opacity and hash parameters, and to the R-VQ indices.
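A minimal PyTorch sketch of steps 1) and 2) on a flat parameter tensor; the Huffman coding of step 3) is omitted, and the magnitude-based pruning criterion here is illustrative.

```python
import torch

def minmax_quantize_8bit(x):
    lo, hi = x.min(), x.max()
    q = torch.round((x - lo) / (hi - lo) * 255).to(torch.uint8)  # 8-bit codes
    return q, lo, hi                                             # keep the range for dequantization

def dequantize(q, lo, hi):
    return q.float() / 255 * (hi - lo) + lo

params = torch.randn(100_000)             # stand-in for hash-grid parameters
params[params.abs() < 0.1] = 0.0          # 2) prune near-zero hash-grid parameters
q, lo, hi = minmax_quantize_8bit(params)  # 1) 8-bit min-max quantization
recovered = dequantize(q, lo, hi)         # lossy but close to the original values
```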

Dataset: Mip-NeRF 360

Method       | PSNR↑ | SSIM↑ | LPIPS↓ | Train   | FPS↑ | Storage
Plenoxels    | 23.08 | 0.626 | 0.463  | 25m 49s | 6.79 | 2.1 GB
INGP-base    | 25.30 | 0.671 | 0.371  | 05m 37s | 11.7 | 13 MB
INGP-big     | 25.59 | 0.699 | 0.331  | 07m 30s | 9.43 | 48 MB
Mip-NeRF 360 | 27.69 | 0.792 | 0.237  | 48h     | 0.06 | 8.6 MB
3DGS         | 27.21 | 0.815 | 0.214  | 41m 33s | 134  | 734 MB
3DGS*        | 27.46 | 0.812 | 0.222  | 24m 07s | 120  | 746 MB
Ours         | 27.08 | 0.798 | 0.247  | 33m 06s | 128  | 48.8 MB
Ours+PP      | 27.03 | 0.797 | 0.247  | -       | -    | 29.1 MB

Dataset: Tanks&Temples

Method       | PSNR↑ | SSIM↑ | LPIPS↓ | Train   | FPS↑ | Storage
Plenoxels    | 21.08 | 0.719 | 0.379  | 25m 05s | 13.0 | 2.3 GB
INGP-base    | 21.72 | 0.723 | 0.330  | 05m 26s | 17.1 | 13 MB
INGP-big     | 21.92 | 0.745 | 0.305  | 06m 59s | 14.4 | 48 MB
Mip-NeRF 360 | 22.22 | 0.759 | 0.257  | 48h     | 0.14 | 8.6 MB
3DGS         | 23.14 | 0.841 | 0.183  | 26m 54s | 154  | 411 MB
3DGS*        | 23.71 | 0.845 | 0.178  | 13m 51s | 160  | 432 MB
Ours         | 23.32 | 0.831 | 0.201  | 18m 20s | 185  | 39.4 MB
Ours+PP      | 23.32 | 0.831 | 0.202  | -       | -    | 20.9 MB

(↑: higher is better, ↓: lower is better; 3DGS* denotes 3DGS results reproduced in our own runs.)

Dataset: Deep Blending

Method       | PSNR↑ | SSIM↑ | LPIPS↓ | Train   | FPS↑ | Storage
Plenoxels    | 23.06 | 0.795 | 0.510  | 27m 49s | 11.2 | 2.7 GB
INGP-base    | 23.62 | 0.797 | 0.423  | 06m 31s | 3.26 | 13 MB
INGP-big     | 24.96 | 0.817 | 0.390  | 08m 00s | 2.79 | 48 MB
Mip-NeRF 360 | 29.40 | 0.901 | 0.245  | 48h     | 0.09 | 8.6 MB
3DGS         | 29.41 | 0.903 | 0.243  | 36m 02s | 137  | 676 MB
3DGS*        | 29.46 | 0.900 | 0.247  | 21m 52s | 132  | 663 MB
Ours         | 29.79 | 0.901 | 0.258  | 27m 33s | 181  | 43.2 MB
Ours+PP      | 29.73 | 0.900 | 0.258  | -       | -    | 23.8 MB

BibTeX

@inproceedings{lee2024compact,
    title={Compact 3D Gaussian Representation for Radiance Field},
    author={Lee, Joo Chan and Rho, Daniel and Sun, Xiangyu and Ko, Jong Hwan and Park, Eunbyung},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2024}
}

We used the project page of Masked Wavelet NeRF as a template.