Okapi: Generalising Better by Making Statistical Matches Match

descriptionPublicationkeyboard_double_arrow_right Article , Preprint , Conference object 01 Jan 2022Embargo end date: 01 Jan 2022Publisher:arXivJournal:CoRR, volume abs/2211.05236Funded by:EC | BayesianGDPR

Authors: Myles Bartlett; Sara Romiti; Viktoriia Sharmanska; Novi Quadrianto;

doi: 10.48550/arxiv.2211.05236

arXiv: 2211.05236

Okapi: Generalising Better by Making Statistical Matches Match

- Summary
- Subjects
- Related research
  (1)
- Metrics

Abstract

We propose Okapi, a simple, efficient, and general method for robust semi-supervised learning based on online statistical matching. Our method uses a nearest-neighbours-based matching procedure to generate cross-domain views for a consistency loss, while eliminating statistical outliers. In order to perform the online matching in a runtime- and memory-efficient way, we draw upon the self-supervised literature and combine a memory bank with a slow-moving momentum encoder. The consistency loss is applied within the feature space, rather than on the predictive distribution, making the method agnostic to both the modality and the task in question. We experiment on the WILDS 2.0 datasets Sagawa et al., which significantly expands the range of modalities, applications, and shifts available for studying and benchmarking real-world unsupervised adaptation. Contrary to Sagawa et al., we show that it is in fact possible to leverage additional unlabelled data to improve upon empirical risk minimisation (ERM) results with the right method. Our method outperforms the baseline methods in terms of out-of-distribution (OOD) generalisation on the iWildCam (a multi-class classification task) and PovertyMap (a regression task) image datasets as well as the CivilComments (a binary classification task) text dataset. Furthermore, from a qualitative perspective, we show the matches obtained from the learned encoder are strongly semantically related. Code for our paper is publicly available at https://github.com/wearepal/okapi/.

Proceeding of NeurIPS 2022

Related Organizations

View all View all

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (cs.LG)

1 Research products, page 1 of 1

oatapi software on GitHub
IsRelatedTo

Impact byBIP!

	selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	0
	popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.	Average
	influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).	Average
	impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.	Average