Downloads provided by UsageCounts
arXiv: 1507.05143
handle: 10161/12013
We introduce a novel low level feature for identifying cover songs which quantifies the relative changes in the smoothed frequency spectrum of a song. Our key insight is that a sliding window representation of a chunk of audio can be viewed as a time-ordered point cloud in high dimensions. For corresponding chunks of audio between different versions of the same song, these point clouds are approximately rotated, translated, and scaled copies of each other. If we treat MFCC embeddings as point clouds and cast the problem as a relative shape sequence, we are able to correctly identify 42/80 cover songs in the "Covers 80" dataset. By contrast, all other work to date on cover songs exclusively relies on matching note sequences from Chroma derived features.
cs.SD, FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Sound
cs.SD, FOS: Computer and information sciences, Sound (cs.SD), Computer Science - Sound
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | 3 | |
| downloads | 5 |

Views provided by UsageCounts
Downloads provided by UsageCounts