Downloads provided by UsageCounts
MIDI-sheet music alignment is the task of finding correspondences between a MIDI representation of a piece and its corresponding sheet music images. Rather than using optical music recognition to bridge the gap between sheet music and MIDI, we explore an alternative approach: projecting the MIDI data into pixel space and performing alignment in the image domain. Our method converts the MIDI data into a crude representation of the score that only contains rectangular floating notehead blobs, a process we call bootleg score synthesis. Furthermore, we project sheet music images into the same bootleg space by applying a deep watershed notehead detector and filling in the bounding boxes around each detected notehead. Finally, we align the bootleg representations using a simple variant of dynamic time warping. On a dataset of 68 real scanned piano scores from IMSLP and corresponding MIDI performances, our method achieves a 97.3% accuracy at an error tolerance of one second, outperforming several baseline systems that employ optical music recognition.
8 pages, 6 figures, 1 table. Accepted paper at the International Society for Music Information Retrieval Conference (ISMIR) 2019
FOS: Computer and information sciences, Sound (cs.SD), Audio and Speech Processing (eess.AS), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Multimedia (cs.MM)
FOS: Computer and information sciences, Sound (cs.SD), Audio and Speech Processing (eess.AS), Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering, Electrical Engineering and Systems Science - Image and Video Processing, Computer Science - Multimedia, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Multimedia (cs.MM)
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | 22 | |
| downloads | 10 |

Views provided by UsageCounts
Downloads provided by UsageCounts