Born Again ReLU (BAReLU): A Branchless Multi-Regime Fusion Activation Function for Resilient Gradient Propagation and Saturated Signal Decoupling

Anthony Aseervatham

Publication type: Report (under curation)
Publisher: Zenodo

Author: Anthony Aseervatham

doi: 10.5281/zenodo.20559130

Abstract

Author: Anthony Aseervatham (Independent Researcher)

Overview

This repository contains the research paper, experimental data, and supporting figures for BAReLU (Born Again ReLU), an activation function designed to eliminate the "Dying ReLU" problem while adding negligible computational overhead.

The Problem

Standard Rectified Linear Units (ReLU) suffer from the Dying ReLU problem: during training, neurons can enter permanently inactive states in which both forward activations and backward gradients are zero. The result is an irreversible loss of representational capacity, with up to 15% of a network's neurons becoming permanently dead. Alternatives such as Leaky ReLU prevent absolute neuron death, but they introduce negative noise accumulation and harsh gradient discontinuities at the origin that destabilize training.

The Solution: BAReLU

BAReLU introduces a Multi-Regime Fusion Framework that cleanly separates two behavioral domains:

Active Regime (x > 0): identity mapping identical to standard ReLU, with zero feature distortion.
Dormant Recovery Regime (x <= 0): linear recovery lane with alpha = 0.01, which prevents neuron death while minimizing negative noise leakage.

BAReLU's distinctive innovation is its Quadratic-Root System, sqrt(max(0, x)^2), which conditions gradient flow at the origin boundary, producing smoother optimization dynamics than both ReLU and Leaky ReLU. The implementation uses branchless torch.where vectorization, achieving hardware execution parity with standard ReLU (less than 1.11% wall-clock overhead).
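
The two regimes described above can be sketched in PyTorch as follows. This is a minimal illustration written from the abstract alone, not the author's implementation: the class name `BAReLU`, the constructor argument `alpha`, and the choice to write the active branch as the identity (on the assumption that sqrt(max(0, x)^2) equals x for x > 0) are all assumptions. The sketch reproduces only the forward values of the two regimes; it does not attempt to reproduce whatever gradient conditioning the paper's quadratic-root formulation provides, since those details are not given here.

```python
import torch
import torch.nn as nn


class BAReLU(nn.Module):
    """Illustrative sketch of the two regimes described in the abstract.

    Hypothetical code written for this summary; not the author's implementation.
    """

    def __init__(self, alpha: float = 0.01) -> None:
        super().__init__()
        # Slope of the dormant recovery regime (x <= 0); 0.01 is the value
        # stated in the abstract.
        self.alpha = alpha

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Assumption: for x > 0, sqrt(max(0, x)^2) equals x, so the active
        # branch is written as the identity.
        # torch.where selects elementwise between the two regimes without
        # any Python-level branching.
        return torch.where(x > 0, x, self.alpha * x)


if __name__ == "__main__":
    act = BAReLU()
    x = torch.tensor([-2.0, 0.0, 3.0], requires_grad=True)
    y = act(x)
    y.sum().backward()
    print(y)       # forward values for a negative, a zero, and a positive input
    print(x.grad)  # per-element slopes: alpha where x <= 0, 1 where x > 0
```

Because the module holds no learnable parameters, it can be dropped in wherever `nn.ReLU()` would appear in an `nn.Sequential` stack.
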

Key Results

CIFAR-10 (3 seeds: 42, 1337, 2026)
BAReLU: 68.32% accuracy, 0.00% neuron death
Standard ReLU: 66.47% accuracy, 15.03% neuron death

CIFAR-100 Fine-Grained (100 classes)
BAReLU: 31.39% accuracy, 0.00% neuron death
Standard ReLU: 29.57% accuracy, 12.11% neuron death

CartPole-v1 via PPO (389 policy updates), final reward
BAReLU: 165.60
Standard ReLU: 138.60
Leaky ReLU: 106.05

Gradient Conditioning Validation
31.1% reduction in gradient norms vs. Leaky ReLU
38.3% reduction in weight update norms vs. Leaky ReLU
