Decomposed Vector-Quantized Variational Autoencoder for Human Grasp Generation

Keywords: Computer Science - Computer Vision and Pattern Recognition

Zhe Zhao; Mengshi Qi; Huadong Ma

arXiv.org e-Print Archive

Preprint . 2024

Data sources: arXiv.org e-Print Archive

https://doi.org/10.1007/978-3-...

Part of book or chapter of book . 2024 . Peer-reviewed

License: Springer Nature TDM

Data sources: Crossref

Publication . Part of book or chapter of book, Preprint . 03 Nov 2024 . English . Publisher: Springer Nature Switzerland

doi: 10.1007/978-3-031-73397-0_26

arXiv: 2407.14062

Abstract

Generating realistic human grasps is a crucial yet challenging task for applications involving object manipulation in computer graphics and robotics. Existing methods often struggle to generate fine-grained, realistic human grasps in which all fingers effectively interact with the object, because they encode the hand as a single whole representation and then estimate both hand posture and position in a single step. In this paper, we propose a novel Decomposed Vector-Quantized Variational Autoencoder (DVQ-VAE) that addresses this limitation by decomposing the hand into several distinct parts and encoding them separately. This part-aware decomposed architecture allows more precise management of the interaction between each component of the hand and the object, enhancing the overall realism of the generated grasps. Furthermore, we design a new dual-stage decoding strategy that first determines the type of grasp under skeletal physical constraints and then identifies the location of the grasp, which greatly improves both the verisimilitude of the results and the adaptability of the model to unseen hand-object interactions. In experiments, our model achieved a relative improvement of about 14.1% in the quality index over state-of-the-art methods on four widely adopted benchmarks. Our source code is available at https://github.com/florasion/D-VQVAE.
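The part-wise encoding described in the abstract can be illustrated with a minimal NumPy sketch of vector quantization applied separately to each hand part. This is not the authors' implementation (see the repository linked above for that). The part names, the codebook size, the latent dimension, and the function names `quantize` and `quantize_parts` are all assumptions made for illustration; the abstract only states that the hand is decomposed into several distinct parts that are encoded separately.

```python
import numpy as np


def quantize(z, codebook):
    """Map each row of z (N, D) to its nearest row of codebook (K, D).

    Nearest is measured by squared Euclidean distance, as in a standard
    vector-quantization lookup. Returns (indices, quantized vectors).
    """
    # Broadcast to an (N, K) matrix of squared distances.
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)
    return idx, codebook[idx]


def quantize_parts(part_latents, codebooks):
    """Quantize each part's latents independently with that part's codebook."""
    return {
        name: quantize(z, codebooks[name]) for name, z in part_latents.items()
    }


# Hypothetical decomposition: the part list and sizes below are assumptions,
# not values taken from the paper.
PARTS = ["thumb", "index", "middle", "ring", "little", "palm"]
CODEBOOK_SIZE, LATENT_DIM, BATCH = 8, 4, 2

rng = np.random.default_rng(0)
codebooks = {p: rng.normal(size=(CODEBOOK_SIZE, LATENT_DIM)) for p in PARTS}
part_latents = {p: rng.normal(size=(BATCH, LATENT_DIM)) for p in PARTS}

result = quantize_parts(part_latents, codebooks)
for name, (idx, zq) in result.items():
    print(name, idx.tolist(), zq.shape)
```

Keeping one codebook per part means each part's discrete codes are chosen without reference to the other parts, which is one plausible reading of "encoding them separately". The dual-stage decoding (grasp type first, then grasp location) is not covered by this sketch.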

Comment: To be published in The 18th European Conference on Computer Vision ECCV 2024

Related Organizations

State Key Laboratory of Networking and Switching Technology
China (People's Republic of)
Beijing University of Posts and Telecommunications
China (People's Republic of)

Related research

vdvae software on GitHub (IsRelatedTo)

Impact by BIP!

selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

