Inter-Frame Compression for Dynamic Point Cloud Geometry Coding

descriptionPublicationkeyboard_double_arrow_right Article , Preprint 01 Jan 2024Embargo end date: 01 Jan 2022Publisher:Institute of Electrical and Electronics Engineers (IEEE)Journal:IEEE Transactions on Image Processing, volume 33, pages 584-594 (issn: 1057-7149, eissn: 1941-0042,

Copyright policy )Funded by:NSF | RINGS: l-RIM: Learning ba..., NSF | Phase I IUCRC University ...

Authors: Anique Akhtar; Zhu Li 0001; Geert Van der Auwera;

doi: 10.1109/tip.2023.3343096 , 10.48550/arxiv.2207.12554

pmid: 38170663

arXiv: 2207.12554

Inter-Frame Compression for Dynamic Point Cloud Geometry Coding

- Summary
- Subjects
- Metrics

Abstract

Efficient point cloud compression is essential for applications like virtual and mixed reality, autonomous driving, and cultural heritage. This paper proposes a deep learning-based inter-frame encoding scheme for dynamic point cloud geometry compression. We propose a lossy geometry compression scheme that predicts the latent representation of the current frame using the previous frame by employing a novel feature space inter-prediction network. The proposed network utilizes sparse convolutions with hierarchical multiscale 3D feature learning to encode the current frame using the previous frame. The proposed method introduces a novel predictor network for motion compensation in the feature domain to map the latent representation of the previous frame to the coordinates of the current frame to predict the current frame's feature embedding. The framework transmits the residual of the predicted features and the actual features by compressing them using a learned probabilistic factorized entropy model. At the receiver, the decoder hierarchically reconstructs the current frame by progressively rescaling the feature embedding. The proposed framework is compared to the state-of-the-art Video-based Point Cloud Compression (V-PCC) and Geometry-based Point Cloud Compression (G-PCC) schemes standardized by the Moving Picture Experts Group (MPEG). The proposed method achieves more than 88% BD-Rate (Bjontegaard Delta Rate) reduction against G-PCCv20 Octree, more than 56% BD-Rate savings against G-PCCv20 Trisoup, more than 62% BD-Rate reduction against V-PCC intra-frame encoding mode, and more than 52% BD-Rate savings against V-PCC P-frame-based inter-frame encoding mode using HEVC. These significant performance gains are cross-checked and verified in the MPEG working group.

Related Organizations

University of Missouri-Kansas City
United States
University of Missouri–Kansas City
United States
Auburn University
United States
Auburn University System
United States

Keywords

FOS: Computer and information sciences, Computer Vision and Pattern Recognition (cs.CV), Image and Video Processing (eess.IV), Computer Science - Computer Vision and Pattern Recognition, FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Multimedia, Multimedia (cs.MM)

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	25
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Top 10%
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Top 10%
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Top 10%

Found an issue? Give us feedback

25

Top 10%

Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering

Funded by

NSF| RINGS: l-RIM: Learning based Resilient Immersive Media-Compression, Delivery, and Interaction, NSF| Phase I IUCRC University of Missouri-Kansas City: Center for Big Learning (CBL)