Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals

Publication: Article, Preprint, Conference object
Date: 01 Jan 2020
Embargo end date: 01 Jan 2020
Publisher: arXiv
Journal: CoRR, volume abs/2006.14150

Authors: Jing Shi 0003; Xuankai Chang; Pengcheng Guo; Shinji Watanabe 0001; Yusuke Fujita; Jiaming Xu 0001; Bo Xu 0002; +1 Authors

doi: 10.48550/arxiv.2006.14150

arXiv: 2006.14150

Abstract

Neural sequence-to-sequence models are well established for applications which can be cast as mapping a single input sequence into a single output sequence. In this work, we focus on one-to-many sequence transduction problems, such as extracting multiple sequential sources from a mixture sequence. We extend the standard sequence-to-sequence model to a conditional multi-sequence model, which explicitly models the relevance between multiple output sequences with the probabilistic chain rule. Based on this extension, our model can conditionally infer output sequences one-by-one by making use of both input and previously-estimated contextual output sequences. This model additionally has a simple and efficient stop criterion for the end of the transduction, making it able to infer the variable number of output sequences. We take speech data as a primary test field to evaluate our methods since the observed speech data is often composed of multiple sources due to the nature of the superposition principle of sound waves. Experiments on several different tasks including speech separation and multi-speaker speech recognition show that our conditional multi-sequence models lead to consistent improvements over the conventional non-conditional models.
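
The one-by-one inference described in the abstract can be illustrated with a minimal sketch. By the probabilistic chain rule, the joint distribution over N output sequences given an input x factorizes as p(s_1, ..., s_N | x) = prod_i p(s_i | x, s_1, ..., s_{i-1}), so each output is inferred from the input and the previously estimated outputs. The Python below is not the authors' implementation: `conditional_chain_decode`, `toy_step`, and the `max_outputs` cap are hypothetical names, the stop criterion is modeled as `step` returning `None`, and `toy_step` is a hand-written stand-in for a trained network.

```python
from typing import Callable, List, Optional, Sequence

Signal = List[float]
# Hypothetical step signature: (mixture, previous outputs) -> next output,
# or None when the (assumed) stop criterion fires.
StepFn = Callable[[Signal, Sequence[Signal]], Optional[Signal]]


def conditional_chain_decode(
    mixture: Signal, step: StepFn, max_outputs: int = 10
) -> List[Signal]:
    """Infer output sequences one by one, each conditioned on the mixture
    and on all previously estimated outputs, until `step` signals stop."""
    outputs: List[Signal] = []
    for _ in range(max_outputs):  # safety cap (an assumption of this sketch)
        nxt = step(mixture, outputs)
        if nxt is None:  # stop: no further source to extract
            break
        outputs.append(nxt)
    return outputs


def toy_step(mixture: Signal, previous: Sequence[Signal]) -> Optional[Signal]:
    """Toy stand-in for a trained model: subtract earlier outputs from the
    mixture and return the samples at the residual's peak magnitude."""
    residual = [m - sum(p[t] for p in previous) for t, m in enumerate(mixture)]
    peak = max((abs(r) for r in residual), default=0.0)
    if peak < 1e-9:  # nothing left to explain, so stop
        return None
    return [r if abs(r) == peak else 0.0 for r in residual]


if __name__ == "__main__":
    # The number of outputs is not fixed in advance; it is decided by the stop signal.
    print(conditional_chain_decode([3.0, 1.0, 0.0, 3.0], toy_step))
```

The design point the sketch tries to capture is that the number of outputs is decided at inference time by the stop signal rather than fixed by the model's output layer.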

15 pages, 5 figures

Related Organizations

Johns Hopkins University, United States
Institute of Automation, China (People's Republic of)
Chinese Academy of Sciences (中国科学院), China (People's Republic of)
Chinese Academy of Science, China (People's Republic of)
INSTITUTE OF AUTOMATION CHINESE ACADEMY OF SCIENCES, China (People's Republic of)

Keywords

FOS: Computer and information sciences; Sound (cs.SD); Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing

Related research

Conv-TasNet software on GitHub (relation: IsRelatedTo)

Impact by BIP!

- Selected citations: 0. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- Influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- Impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Access route: Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
