Traces of Class/Cross-Class Structure Pervade Deep Learning Spectra

Creator: Vardan Papyan
Keywords: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, 0202 electrical engineering, electronic engineering, information engineering, Machine Learning (stat.ML), 02 engineering and technology, 01 natural sciences, 0105 earth and related environmental sciences

Preprint . 2020

Data sources: arXiv.org e-Print Archive

https://dx.doi.org/10.48550/ar...

Article . 2020

License: arXiv Non-Exclusive Distribution

Data sources: Datacite

DBLP

Article

Data sources: DBLP


Publication: Article, Preprint
Date: 01 Jan 2020
Embargo end date: 01 Jan 2020
Publisher: arXiv
Journal: CoRR, volume abs/2008.11865

Authors: Vardan Papyan

doi: 10.48550/arxiv.2008.11865

arXiv: 2008.11865

Abstract

Numerous researchers have recently applied empirical spectral analysis to the study of modern deep learning classifiers. We identify and discuss an important formal class/cross-class structure and show how it lies at the origin of the many visually striking features observed in deepnet spectra, some of which were reported in recent articles and others of which are unveiled here for the first time. These features include spectral outliers ("spikes") and small but distinct continuous distributions ("bumps"), often seen beyond the edge of a "main bulk". The significance of the cross-class structure is illustrated in three ways: (i) we prove that the ratio of outliers to bulk in the spectrum of the Fisher information matrix is predictive of misclassification, in the context of multinomial logistic regression; (ii) we demonstrate how, gradually with depth, a network is able to separate class-distinctive information from class variability, all while orthogonalizing the class-distinctive information; and (iii) we propose a correction to KFAC, a well-known second-order optimization algorithm for training deepnets.
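
As a rough illustration of how class structure can produce "spikes" beyond a "main bulk", the following Python sketch builds synthetic Gaussian features, splits their covariance into a between-class part and a within-class part, and inspects the eigenvalues. It is not taken from the paper and does not reproduce its definitions or experiments: the toy data model, the function name `class_structure_spectrum`, and all parameter values are assumptions chosen for illustration, and NumPy is assumed to be installed.

```python
import numpy as np


def class_structure_spectrum(num_classes=4, per_class=200, dim=50,
                             mean_scale=5.0, seed=0):
    """Split a synthetic feature covariance into class and within-class parts."""
    rng = np.random.default_rng(seed)
    # Assumed toy model: one Gaussian mean per class plus unit-variance noise.
    means = mean_scale * rng.standard_normal((num_classes, dim))
    labels = np.repeat(np.arange(num_classes), per_class)
    feats = means[labels] + rng.standard_normal((labels.size, dim))

    global_mean = feats.mean(axis=0)
    class_means = np.stack([feats[labels == c].mean(axis=0)
                            for c in range(num_classes)])

    # Between-class part: covariance of the class means (equal class sizes).
    centered_means = class_means - global_mean
    between = centered_means.T @ centered_means / num_classes
    # Within-class part: covariance of features around their own class mean.
    residuals = feats - class_means[labels]
    within = residuals.T @ residuals / labels.size
    # Total covariance of the features around the global mean.
    centered = feats - global_mean
    total = centered.T @ centered / labels.size
    return total, between, within


total, between, within = class_structure_spectrum()
eigs = np.linalg.eigvalsh(total)[::-1]  # eigenvalues, largest first
num_classes = 4
print("top eigenvalues:", np.round(eigs[:num_classes + 1], 2))
print("spike/bulk gap :", eigs[num_classes - 2] / eigs[num_classes - 1])
```

In this toy setting the total covariance equals the sum of the two parts, and the between-class part has rank at most C - 1 for C classes, so at most C - 1 eigenvalues can lie above the largest within-class eigenvalue. The cross-class structure discussed in the abstract is not modeled here.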

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Machine Learning, Computer Vision and Pattern Recognition (cs.CV), Computer Science - Computer Vision and Pattern Recognition, Machine Learning (stat.ML), Machine Learning (cs.LG)

