
doi: 10.1145/3485664
Synthesizing human motion from music (i.e., music to dance) is appealing and has attracted considerable research interest in recent years. It is challenging because dance requires realistic and complex human motion and, more importantly, because the synthesized motion should be consistent with the style, rhythm, and melody of the music. In this article, we propose a novel autoregressive generative model, DanceNet, which takes the style, rhythm, and melody of music as control signals to generate 3D dance motions with high realism and diversity. To handle the high long-term spatio-temporal complexity of dance, we use dilated convolutions to enlarge the receptive field, and adopt gated activation units as well as separable convolutions to enhance the fusion of motion features and control signals. To boost the performance of the proposed model, we capture several synchronized music-dance pairs performed by professional dancers and build a high-quality music-dance pair dataset. Experiments demonstrate that the proposed method achieves state-of-the-art results.
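The two building blocks named in the abstract, dilated convolution and the gated activation unit, can be illustrated with a small sketch. It is not DanceNet's architecture: the kernel size, the dilation schedule, the additive form of the control term, and the helper names `receptive_field` and `gated_activation` are all assumptions made for illustration, following the common WaveNet-style formulation.

```python
import math


def receptive_field(kernel_size, dilations):
    """Input frames seen by one output of a stack of stride-1 dilated convolutions."""
    # Each layer widens the window by (kernel_size - 1) * dilation frames.
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_activation(filter_pre, gate_pre, filter_ctrl=0.0, gate_ctrl=0.0):
    """WaveNet-style gate on scalar pre-activations (assumed form, not from the paper).

    The control terms stand in for projected music features that are added
    to the motion pre-activations before the nonlinearities.
    """
    return math.tanh(filter_pre + filter_ctrl) * sigmoid(gate_pre + gate_ctrl)


# Eight layers with kernel size 2: doubling dilations versus no dilation.
dilated = receptive_field(2, [1, 2, 4, 8, 16, 32, 64, 128])
plain = receptive_field(2, [1] * 8)
print(dilated, plain)  # 256 9
```

With the same number of layers, the doubling schedule covers 256 frames where the undilated stack covers 9, which is why dilation is a natural choice for long motion sequences. The sigmoid branch acts as a soft switch on the tanh branch, so a control signal added to the gate can amplify or suppress a motion feature.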
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 101 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 1% |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 10% |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 1% |
