
The ReLU function is non-linear. However, if we express it as hard constraints in a linear program, the problem becomes a convex optimization problem, since linear programming is convex. Viewed this way, the ReLU function enters the linear program as a set of linear inequalities. If we plot an example of a feasible region in linear programming, it is bounded by several inequality lines, with the green strips marking the invalid (infeasible) regions. So in a BSnet ReLU network, the ReLU activation functions act like inequalities, and the BSnet can be represented as a linear program. The linear transform applied by the weights of a layer in BSnet reorients the corresponding inequality hyperplanes in the linear program.
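As a minimal sketch of this idea (not the BSnet formulation itself), the output y of a ReLU applied to a fixed pre-activation x can be recovered as the solution of a small linear program with the inequalities y >= x and y >= 0: minimizing y over that feasible region gives exactly max(x, 0). The function name `relu_via_lp` and the use of `scipy.optimize.linprog` below are illustrative assumptions, not part of the original text.

```python
import numpy as np
from scipy.optimize import linprog


def relu_via_lp(x_pre: float) -> float:
    """Recover ReLU(x_pre) as the optimum of a tiny linear program.

    minimize   y
    subject to y >= x_pre   (written as -y <= -x_pre for scipy)
               y >= 0       (handled by the variable bound)
    The feasible region is a half-line, so the minimizer is max(x_pre, 0).
    """
    res = linprog(
        c=[1.0],              # objective: minimize y
        A_ub=[[-1.0]],        # -y <= -x_pre  <=>  y >= x_pre
        b_ub=[-x_pre],
        bounds=[(0.0, None)], # y >= 0
        method="highs",
    )
    return float(res.x[0])


if __name__ == "__main__":
    # The LP optimum matches max(x, 0) for every pre-activation value.
    for x in [-2.0, 0.0, 3.5]:
        print(f"x = {x:+.1f}  LP result = {relu_via_lp(x):.4f}  ReLU = {max(x, 0.0):.4f}")
```

For a whole layer, the same pattern applies with a vector of outputs y and the constraints y >= W a + b and y >= 0, where a is the previous layer's activation; changing the weights W changes the slopes of these constraint hyperplanes, which is the reorientation described above.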
