Generates a weight matrix descrambler for a particular layer of a neural network using the Tikhonov smoothness criterion. The particulars are described in https://arxiv.org/abs/1912.01498
S - a matrix containing, in its columns, the outputs of the preceding layers of the neural network for a (preferably large) number of reasonable inputs
n_iter - maximum number of Newton-Raphson iterations, 400 is generally sufficient
guess - [optional] the initial guess for the descrambling transform generator (only the lower triangle is used), a reasonable choice is a zero matrix (default)
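The relationship between guess, the generator and the resulting transform can be sketched as follows. This is a minimal illustration under stated assumptions, not the routine itself: it assumes the generator is the antisymmetric matrix built from the strict lower triangle of guess, that the transform is its matrix exponential, and that smoothness is measured with a second-difference matrix. The names generator_from_guess, orthogonal_from_generator, second_difference and roughness are hypothetical.

```python
import numpy as np

def generator_from_guess(guess):
    """Antisymmetric generator from the strict lower triangle of guess (assumed convention)."""
    lower = np.tril(guess, k=-1)
    return lower - lower.T

def orthogonal_from_generator(Q):
    """Matrix exponential of a real antisymmetric Q, computed via the Hermitian matrix 1j*Q."""
    eigvals, eigvecs = np.linalg.eigh(1j * Q)
    # Q = -1j * V diag(eigvals) V^H, hence exp(Q) = V diag(exp(-1j*eigvals)) V^H
    P = (eigvecs * np.exp(-1j * eigvals)) @ eigvecs.conj().T
    return P.real

def second_difference(n):
    """(n-2) x n second-order finite-difference matrix (assumed smoothness operator)."""
    D = np.zeros((n - 2, n))
    for k in range(n - 2):
        D[k, k:k + 3] = [1.0, -2.0, 1.0]
    return D

def roughness(P, S):
    """Tikhonov-style roughness: squared Frobenius norm of D @ P @ S; smaller means smoother."""
    D = second_difference(S.shape[0])
    return np.linalg.norm(D @ P @ S, "fro") ** 2

# A zero guess gives a zero generator, so the starting transform is the identity
P_start = orthogonal_from_generator(generator_from_guess(np.zeros((4, 4))))
```

Under these assumptions the default zero guess starts the iteration from the unmodified layer output, and any generator yields an orthogonal transform.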
P - descrambling matrix. When the network is wiretapped before the activation function, i.e. S = Wf(W...f(Wf(WX))), matrix P descrambles the output dimension of the left-most W. When the network is wiretapped after the activation function, i.e. S = f(Wf(W...f(Wf(WX)))), matrix inv(P) descrambles the input dimension of the weight matrix of the subsequent layer.
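The two wiretapping cases can be illustrated on a toy two-layer network. This is a sketch only: the weights are random, tanh stands in for the activation function, and a random orthogonal matrix from a QR factorisation stands in for the P that the routine would return. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out, n_samples = 6, 5, 3, 50

X = rng.standard_normal((n_in, n_samples))      # inputs in columns
W1 = rng.standard_normal((n_hidden, n_in))      # first layer weights
W2 = rng.standard_normal((n_out, n_hidden))     # subsequent layer weights

# Stand-in for the descrambling matrix (assumption: any invertible P shows the algebra)
P, _ = np.linalg.qr(rng.standard_normal((n_hidden, n_hidden)))

# Tapped before the activation: S = W1 X, so P acts on the output dimension of W1
S_pre = W1 @ X
W1_descrambled = P @ W1
pre_ok = np.allclose(W1_descrambled @ X, P @ S_pre)

# Tapped after the activation: S = f(W1 X); the next layer sees W2 S = (W2 inv(P)) (P S),
# so inv(P) acts on the input dimension of W2 and the layer output is unchanged
S_post = np.tanh(W1 @ X)
W2_descrambled = W2 @ np.linalg.inv(P)
post_ok = np.allclose(W2_descrambled @ (P @ S_post), W2 @ S_post)
```

In the second case, inserting inv(P) P between the layers leaves the network output unchanged, which is why the descrambled weights can be inspected in place of the originals.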
An example of this function being applied to DEERNet is published in https://arxiv.org/abs/1912.01498