
We present a layer-level analysis framework for BERT across five GLUE tasks. Using RX(River XAI), a charge-flow based interpretable learning framework, we replace BERT’s classifier with a 2–16 node interpretable network and identify removable layers through separability analysis. Our key contributions are: (1) a separability-guided layer skip method validated by actual BERT forward-pass experiments on all five tasks, (2) quantitative decomposition of FFN’s dual role — 92% structural (norm normalization) vs. 8% classification-relevant — explaining why FFN removal causes model collapse while individual layers appear “harmful” to classification, and (3) error analysis revealing that 60–93% of misclassifications are high-confidence errors (margin > 0.3), indicating BERT’s CLS representation itself is the bottleneck. RX is one application of a broader proprietary learning framework developed at River Lab; method specifics are subject to intellectual property protection.
