
Version identification (VI), or cover song identification, is the task of automatically detecting whether musical tracks originate from the same work. An important step in approaches to this task is shingling along the time axis. Recently proposed models achieve better retrieval accuracy with shorter shingles (e.g., 20 seconds) than with longer ones (e.g., more than 1 minute) or even full tracks. However, all current approaches use a fixed shingle length, even though musically meaningful segments (e.g., a chorus) usually vary in length and may even differ between versions of the same work. This case study explores new perspectives on VI beyond fixed-length shingles. Based on a new VI dataset with manually annotated segment labels, we investigate the distributions of pairwise distances between version embeddings obtained from a state-of-the-art VI model. We further examine the impact of different shingle-to-segment offsets to uncover potential performance degradation in current VI testing methods.
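To make the notions of fixed-length shingles and shingle-to-segment offsets concrete, the following is a minimal sketch and not the procedure used in the study. It assumes a track is represented as a sequence of feature frames at one frame per second, and the function names (`fixed_shingles`, `offset_to_segment`, `overlap_ratio`) are hypothetical helpers introduced only for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start_frame, end_frame), end exclusive


def fixed_shingles(n_frames: int, shingle_len: int, hop: int) -> List[Span]:
    """Cut a track of n_frames into fixed-length shingles with a given hop."""
    if shingle_len <= 0 or hop <= 0:
        raise ValueError("shingle_len and hop must be positive")
    # Only shingles that fit entirely inside the track are kept.
    return [(s, s + shingle_len) for s in range(0, n_frames - shingle_len + 1, hop)]


def offset_to_segment(shingle: Span, segment: Span) -> int:
    """Signed distance in frames from the segment start to the shingle start."""
    return shingle[0] - segment[0]


def overlap_ratio(shingle: Span, segment: Span) -> float:
    """Fraction of the shingle that lies inside the annotated segment."""
    inter = max(0, min(shingle[1], segment[1]) - max(shingle[0], segment[0]))
    return inter / (shingle[1] - shingle[0])


# Assumed toy values: a 100-frame track, 20-frame shingles, 10-frame hop,
# and one annotated segment (e.g., a chorus) spanning frames 20 to 50.
shingles = fixed_shingles(n_frames=100, shingle_len=20, hop=10)
chorus = (20, 50)
for sh in shingles[:4]:
    print(sh, offset_to_segment(sh, chorus), overlap_ratio(sh, chorus))
```

A shingle that starts before the annotated segment covers it only partially, which is the kind of misalignment the offset analysis is concerned with; a segment-aware alternative would slice the track at the annotated boundaries instead of at a fixed hop.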
Version identification, Segment, Shingle
