Publication: Article, Preprint

Published: 01 Oct 2015

Embargo end date: 01 Jan 2012

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 37, pages 2119-2130 (issn: 0162-8828, eissn: 2160-9292)

Authors: Shie Mannor; Maayan Harel

doi: 10.1109/tpami.2015.2404836, 10.48550/arxiv.1210.4006

pmid: 26353188

arXiv: http://arxiv.org/abs/1210.4006

The Perturbed Variation

Abstract

We introduce a new discrepancy score between two distributions that indicates how similar they are. While much research has been done to determine whether two samples come from exactly the same distribution, much less has considered the problem of determining whether two finite samples come from similar distributions. The new score gives an intuitive interpretation of similarity: it optimally perturbs the distributions so that they best fit each other. The score is defined between distributions and can be efficiently estimated from samples. We provide convergence bounds for the estimated score and develop hypothesis testing procedures that test whether two data sets come from similar distributions. The statistical power of these procedures is presented in simulations. We also compare the score's capacity to detect similarity with that of other known measures on real data.

Related Organizations

Technion – Israel Institute of Technology
Israel

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)

Impact by BIP!

citations: 1 (an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)

popularity: Average (reflects the "current" impact/attention, the "hype", of an article in the research community at large, based on the underlying citation network)

influence: Average (reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)

impulse: Average (reflects the initial momentum of an article directly after its publication, based on the underlying citation network)

Open access route: Green

Fields of Science

engineering and technology

electrical engineering, electronic engineering, information engineering
