
arXiv: 1710.11469
handle: 20.500.11850/455459
Abstract: When training a deep neural network for image classification, one can broadly distinguish between two types of latent features of images that will drive the classification. We can divide latent features into (i) ‘core’ or ‘conditionally invariant’ features $C$ whose distribution $C \vert Y$, conditional on the class $Y$, does not change substantially across domains and (ii) ‘style’ features $S$ whose distribution $S \vert Y$ can change substantially across domains. Examples of style features include position, rotation, image quality or brightness, but also more complex ones like hair color or posture for images of persons. Our goal is to minimize a loss that is robust under changes in the distribution of these style features. In contrast to previous work, we assume that the domain itself is not observed and hence a latent variable. We do assume that we can sometimes observe a typically discrete identifier or “$\mathrm{ID}$ variable”. In some applications we know, for example, that two images show the same person, and $\mathrm{ID}$ then refers to the identity of the person. The proposed method requires only a small fraction of images to have $\mathrm{ID}$ information. We group observations if they share the same class and identifier $(Y,\mathrm{ID})=(y,\mathrm{id})$ and penalize the conditional variance of the prediction or the loss if we condition on $(Y,\mathrm{ID})$. Using a causal framework, this conditional variance regularization (CoRe) is shown to protect asymptotically against shifts in the distribution of the style variables in a partially linear structural equation model. Empirically, we show that the CoRe penalty improves predictive accuracy substantially in settings where domain changes occur in terms of image quality, brightness and color, while we also look at more complex changes such as changes in movement and posture.
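The grouping-and-penalty step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `core_penalty`, the use of `None` to mark images without ID information, the choice to average only over groups with at least two members, and the weight `lam` are all assumptions made for illustration.

```python
from collections import defaultdict

import numpy as np


def core_penalty(preds, labels, ids):
    """Average within-group variance of predictions, grouping on (label, id).

    Hypothetical helper: observations whose id is None carry no ID
    information and are skipped (an assumption of this sketch).
    """
    preds = np.asarray(preds, dtype=float)

    # Collect the row indices of every (Y, ID) = (y, id) group.
    groups = defaultdict(list)
    for i, (y, g) in enumerate(zip(labels, ids)):
        if g is not None:
            groups[(y, g)].append(i)

    # Population variance of the predictions inside each group, summed over
    # output dimensions; singleton groups are ignored here.
    variances = [
        preds[idx].var(axis=0).sum()
        for idx in groups.values()
        if len(idx) > 1
    ]
    return float(np.mean(variances)) if variances else 0.0


# Toy example: two images share (class 0, id "a"), two share (class 1, id "b"),
# and one image has no ID information.
preds = [0.0, 2.0, 5.0, 1.0, 1.0]
labels = [0, 0, 1, 1, 1]
ids = ["a", "a", None, "b", "b"]

penalty = core_penalty(preds, labels, ids)

# A regularized objective would then take the form: loss + lam * penalty,
# where lam is a tuning parameter (hypothetical name).
lam = 1.0
base_loss = 0.3  # placeholder value standing in for the usual training loss
objective = base_loss + lam * penalty
```

In the toy data, the first group's predictions disagree while the second group's predictions agree, so only the first group contributes to the penalty. In a real training loop the same quantity would be computed on differentiable network outputs so that its gradient can be used alongside the classification loss.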
FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Machine Learning (cs.LG), Machine Learning (stat.ML), Classification and discrimination; cluster analysis (statistical aspects), Learning and adaptive systems in artificial intelligence, Artificial neural networks and deep learning, Anti-causal prediction, Causal models, Distributional robustness, Domain shift, Dataset shift, Image classification
| selected citations | 36 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
