
This paper presents a novel two-pass dynamic time warping (DTW) approach to building a Query-by-Example Spoken Term Detection (QbE-STD) system for zero-resource languages. An unconstrained-endpoint dynamic time warping (UE-DTW) algorithm is used to locate occurrences of the query term in long conversational audio. The proposed approach uses a segmental DTW in which the search is carried out only at syllable boundaries. This reduces the search complexity by a factor of 9 compared to conventional sliding-window DTW. The first pass of the proposed method uses a minimal set of templates for a keyword to search through the segmented audio, and new templates are identified from its results. In the second pass, the initial templates together with the new templates identified in the first pass are used to search for keyword occurrences. A novel score normalization technique is also proposed, in which the syllables constituting the keyword are used for normalization. The proposed two-pass system is shown to outperform single-pass systems, and the proposed score normalization technique further improves the overall detection results.
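The boundary-restricted search described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that "unconstrained endpoint" means the alignment starts at a fixed frame and may end at any frame of the candidate segment, that syllable boundaries are already available as frame indices, and that frames are compared with Euclidean distance. The names `frame_dist`, `ue_dtw`, `search_at_boundaries`, and the `max_len` parameter are hypothetical, and the cost is normalized by query length rather than by the syllable-based normalization the paper proposes.

```python
import math


def frame_dist(a, b):
    """Euclidean distance between two feature vectors (an assumed local cost)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ue_dtw(query, segment):
    """DTW with a fixed start and a free end point.

    The path starts at (query[0], segment[0]) and must consume the whole
    query, but may stop at any frame of the segment.
    Returns (cost normalized by query length, end index within the segment).
    """
    n, m = len(query), len(segment)
    # First row: query[0] stretched over segment[0..j].
    prev = []
    acc = 0.0
    for j in range(m):
        acc += frame_dist(query[0], segment[j])
        prev.append(acc)
    for i in range(1, n):
        cur = [0.0] * m
        for j in range(m):
            best = prev[j]  # vertical step
            if j > 0:
                best = min(best, cur[j - 1], prev[j - 1])  # horizontal, diagonal
            cur[j] = frame_dist(query[i], segment[j]) + best
        prev = cur
    # Free end point: choose the cheapest column in the last row.
    end = min(range(m), key=lambda j: prev[j])
    return prev[end] / n, end


def search_at_boundaries(query, audio, boundaries, max_len):
    """Run UE-DTW only from the given boundary frames, not from every frame.

    Returns (cost, start_frame, end_frame) tuples sorted by ascending cost.
    """
    hits = []
    for b in boundaries:
        segment = audio[b:b + max_len]  # assumed cap on match length
        if not segment:
            continue
        cost, end = ue_dtw(query, segment)
        hits.append((cost, b, b + end))
    return sorted(hits)
```

Starting the search only at the entries of `boundaries` is where the saving over a sliding window comes from in this sketch: the number of DTW runs equals the number of boundaries rather than the number of frames. In a two-pass setup, low-cost hits from a first call could be cut out of `audio` and passed back in as additional queries, though how the paper selects new templates is not specified here.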
