deel.lip.normalizers module

This module contains the computation functions for Bjorck and spectral normalization. It is intended for internal use only.

deel.lip.normalizers.bjorck_normalization(w, eps=0.001, beta=0.5, maxiter=15)

Apply Bjorck normalization to w.

Parameters:

w – weight to normalize; for the algorithm to work properly, its largest singular value must be approximately 1 (max_eigenval(w) ~= 1)
eps – epsilon stopping criterion: norm(w_t - w_{t-1}) must be less than eps
beta – the beta coefficient used in each iteration; must be in the interval (0, 0.5]
maxiter – maximum number of iterations for the algorithm

Returns:

the orthonormal weights
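To illustrate what these parameters control, here is a minimal NumPy sketch of a Bjorck iteration. It assumes the usual first-order update w <- (1 + beta) * w - beta * w @ w.T @ w. The name bjorck_sketch and the test matrix are hypothetical, and this is not the library's own code.

```python
import numpy as np


def bjorck_sketch(w, eps=1e-3, beta=0.5, maxiter=15):
    """Illustrative Bjorck iteration (hypothetical helper, not the library code)."""
    w = np.asarray(w, dtype=np.float64)
    for _ in range(maxiter):
        w_prev = w
        # Assumed first-order Bjorck update.
        w = (1.0 + beta) * w - beta * (w @ w.T @ w)
        # Stopping criterion: norm(w_t - w_{t-1}) < eps.
        if np.linalg.norm(w - w_prev) < eps:
            break
    return w


rng = np.random.default_rng(0)
w0 = rng.normal(size=(4, 3))
# Rescale so that the largest singular value is 1, as the algorithm expects.
w0 = w0 / np.linalg.norm(w0, 2)
w_orth = bjorck_sketch(w0, eps=1e-6, maxiter=100)
gram = w_orth.T @ w_orth  # close to the 3x3 identity once converged
```

The explicit rescaling step mirrors the requirement on w stated above: without it, the update can diverge when the largest singular value is well above 1.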

deel.lip.normalizers.reshaped_kernel_orthogonalization(kernel, u, adjustment_coef, eps_spectral=0.001, eps_bjorck=0.001, beta=0.5, maxiter_spectral=10, maxiter_bjorck=15)

Perform reshaped kernel orthogonalization (RKO) on the kernel given as input. It applies the power method to find the largest singular value, then applies the Bjorck algorithm to the rescaled kernel. This rescaling greatly improves the stability and convergence speed of the Bjorck algorithm.

Parameters:

kernel – the kernel to orthogonalize
u – the vector used to do the power iteration method
adjustment_coef – the adjustment coefficient as used in convolution
eps_spectral – stopping criterion for the spectral (power iteration) algorithm
eps_bjorck – stopping criterion for the Bjorck algorithm
beta – the beta used in the Bjorck algorithm
maxiter_spectral – maximum number of iterations for the power iteration
maxiter_bjorck – maximum number of iterations for the Bjorck algorithm

Returns:

the orthogonalized kernel, the new u, and sigma, which is the largest singular value
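The pipeline described above (flatten, power iteration, rescale, Bjorck) can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions: rko_sketch is a hypothetical name, the kernel is assumed to be flattened to 2D along its last axis, u is assumed to be a vector whose length matches that axis, adjustment_coef is assumed to be a plain multiplicative factor applied at the end, and the eps stopping criteria are omitted for brevity.

```python
import numpy as np


def rko_sketch(kernel, u, adjustment_coef=1.0, beta=0.5,
               maxiter_spectral=10, maxiter_bjorck=15):
    """Illustrative RKO pipeline (hypothetical helper, not the library code)."""
    shape = kernel.shape
    # Assumption: flatten every axis except the last one.
    w = np.asarray(kernel, dtype=np.float64).reshape(-1, shape[-1])
    u = np.asarray(u, dtype=np.float64)
    u = u / np.linalg.norm(u)

    # Power iteration: estimate the largest singular value of w.
    for _ in range(maxiter_spectral):
        v = w @ u
        v = v / np.linalg.norm(v)
        u = w.T @ v
        u = u / np.linalg.norm(u)
    sigma = np.linalg.norm(w @ u)

    # Rescale so the largest singular value is about 1, then run Bjorck.
    w = w / sigma
    for _ in range(maxiter_bjorck):
        w = (1.0 + beta) * w - beta * (w @ w.T @ w)

    return (adjustment_coef * w).reshape(shape), u, sigma


rng = np.random.default_rng(0)
kernel = rng.normal(size=(3, 3, 2, 4))  # e.g. a conv-like kernel
w_bar, u_new, sigma = rko_sketch(kernel, np.ones(4),
                                 maxiter_spectral=50, maxiter_bjorck=30)
flat = w_bar.reshape(-1, 4)
gram = flat.T @ flat  # close to the 4x4 identity once converged
```

Returning the updated u lets a caller reuse it as the starting vector for the next call, so the power iteration does not restart from scratch each time.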

deel.lip.normalizers.set_stop_grad_spectral(value: bool)

Set the global STOP_GRAD_SPECTRAL to value. This function must be called before constructing the model (that is, before the first call of reshaped_kernel_orthogonalization) for the setting to take effect.

Parameters:

value – boolean. When set to True, back-propagation through the power iteration algorithm is disabled: the gradient accounts for how updates affect the maximum singular value, but not for how they affect the largest singular vector. When set to False, gradients are back-propagated through the while loop.

deel.lip.normalizers.set_swap_memory(value: bool)

Set the global SWAP_MEMORY to value. This function must be called before constructing the model (that is, before the first call of reshaped_kernel_orthogonalization) for the setting to take effect.

Parameters:

value – boolean that will be used as the swap_memory parameter of the while loops in the spectral and Bjorck algorithms.

deel.lip.normalizers.spectral_normalization(kernel, u, eps=0.001, maxiter=10)

Normalize the kernel so that its largest singular value equals 1.

Parameters:

kernel – the kernel to normalize, assuming a 2D kernel
u – initialization for the max eigenvector
eps – epsilon stopping criterion: norm(u_t - u_{t-1}) must be less than eps
maxiter – maximum number of iterations for the algorithm

Returns:

the normalized kernel w_bar, its shape, the maximum eigenvector, and the maximum singular value
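The role of u, eps and maxiter can be seen in a small NumPy sketch of power iteration on a 2D kernel. It is only an illustration: power_iteration_sketch is a hypothetical name, u is assumed to be a 1D vector with one entry per kernel column, and the sketch returns three values rather than the library's exact return tuple.

```python
import numpy as np


def power_iteration_sketch(kernel, u, eps=1e-3, maxiter=10):
    """Illustrative spectral normalization (hypothetical helper, not the library code)."""
    kernel = np.asarray(kernel, dtype=np.float64)
    u = np.asarray(u, dtype=np.float64)
    u = u / np.linalg.norm(u)
    for _ in range(maxiter):
        u_prev = u
        # One round trip through the kernel and its transpose.
        v = kernel @ u
        v = v / np.linalg.norm(v)
        u = kernel.T @ v
        u = u / np.linalg.norm(u)
        # Stopping criterion: norm(u_t - u_{t-1}) < eps.
        if np.linalg.norm(u - u_prev) < eps:
            break
    sigma = np.linalg.norm(kernel @ u)  # estimate of the largest singular value
    return kernel / sigma, u, sigma


kernel = np.diag([3.0, 1.0, 0.5])
w_bar, u, sigma = power_iteration_sketch(kernel, np.ones(3), eps=1e-9, maxiter=50)
```

For this diagonal kernel the estimate converges to the largest diagonal entry, and u aligns with the first coordinate axis.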

deel.lip.normalizers.spectral_normalization_conv(kernel, u, stride=1.0, conv_first=True, pad_func=None, eps=0.001, maxiter=10)

Normalize the convolution kernel to have its max eigenvalue == 1.

Parameters:

kernel – the convolution kernel to normalize
u – initialization for the max eigenvector (as a 4D tensor)
stride – stride parameter of convolutions
conv_first – selects the RO or CO case; should be True in the CO case (stride^2 * C < M)
circular_paddings – Circular padding (k//2,k//2)
eps – epsilon stopping criterion: norm(u_t - u_{t-1}) must be less than eps
maxiter – maximum number of iterations for the power iteration algorithm.

Returns:

the normalized kernel w_bar, the maximum eigenvector, and the maximum eigenvalue