Linear_cdopt

CLASS cdopt.nn.Linear_cdopt(in_features, out_features, bias=True, device=None, dtype=None, manifold_class=euclidean_torch, penalty_param=0, weight_var_transfer=None, manifold_args={})

Applies a linear transformation to the incoming data: y = xA^T + b, where the weight matrix A is restricted to the manifold specified by manifold_class.

This module supports TensorFloat32 and is developed based on torch.nn.Linear.
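
Since Linear_cdopt is developed based on torch.nn.Linear, it can serve as a drop-in replacement for a standard linear layer inside an ordinary PyTorch model. A minimal sketch, assuming a small two-layer model and the cdopt.manifold_torch.stiefel_torch manifold (both are illustrative choices, not prescribed by this page):

import torch
import cdopt.nn
import cdopt.manifold_torch

# Hypothetical model: the first linear layer keeps its weight on the Stiefel manifold.
model = torch.nn.Sequential(
    cdopt.nn.Linear_cdopt(20, 30, manifold_class=cdopt.manifold_torch.stiefel_torch),
    torch.nn.ReLU(),
    torch.nn.Linear(30, 10),
)
x = torch.randn(8, 20)
print(model(x).size())
# expected to print torch.Size([8, 10])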

Parameters:

  • in_features – size of each input sample

  • out_features – size of each output sample

  • bias – If set to False, the layer will not learn an additive bias. Default: True

  • manifold_class – The manifold class for the weight matrix. Default: cdopt.manifold_torch.euclidean_torch

  • penalty_param – The penalty parameter for the quadratic penalty term in the constraint dissolving function. Default: 0

  • manifold_args – Additional keyword arguments that help to define the manifold constraints. Default: {}
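
A minimal construction sketch using these parameters. The stiefel_torch manifold and the penalty value 0.05 are illustrative assumptions; manifold_args is left at its default, since the keys it accepts depend on the chosen manifold class:

import cdopt.nn
import cdopt.manifold_torch

# Weight constrained to the Stiefel manifold; penalty_param sets the coefficient
# of the quadratic penalty term in the constraint dissolving function.
layer = cdopt.nn.Linear_cdopt(
    20, 30,
    bias=True,
    manifold_class=cdopt.manifold_torch.stiefel_torch,
    penalty_param=0.05,
)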

Shape:

  • Input: (*, H_in) where * means any number of dimensions including none and H_in = in_features.

  • Output: (*, H_out) where all but the last dimension are the same shape as the input and H_out = out_features.
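
As with torch.nn.Linear, all leading dimensions are passed through unchanged and only the last dimension is transformed. A small sketch with an illustrative 3-D input:

import torch
import cdopt.nn

layer = cdopt.nn.Linear_cdopt(20, 30)   # default euclidean_torch manifold
x = torch.randn(4, 7, 20)               # (*, H_in) with * = (4, 7)
print(layer(x).size())
# expected to print torch.Size([4, 7, 30])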

Attributes:

  • manifold (cdopt manifold class) – the manifold that defines the constraints. The shape of the variables in manifold is set as (out_features, in_features) if out_features ≥ in_features. Otherwise, it is set as (in_features, out_features).

  • weight (torch.Tensor) – the learnable weights of the module of shape (out_features, in_features). The values are initialized from manifold.Init_point(X_init), where X_init ~ U(-√k, √k) with k = 1/in_features.

  • bias – the learnable bias of the module of shape (out_features). If bias is True, the values are initialized from U(-√k, √k) where k = 1/in_features.

  • quad_penalty (callable) – the function that returns the quadratic penalty term of the weights. Its return value equals ||manifold.C(weight)||^2.
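
The quad_penalty attribute is one way to carry the manifold constraint into a standard training loop: adding its value to the task loss lets an ordinary unconstrained PyTorch optimizer account for the constraint violation ||manifold.C(weight)||^2. A minimal training-step sketch; the random data, the mean-squared-error loss, and the choice of adding quad_penalty directly to the loss are illustrative assumptions:

import torch
import cdopt.nn
import cdopt.manifold_torch

layer = cdopt.nn.Linear_cdopt(
    20, 30, manifold_class=cdopt.manifold_torch.stiefel_torch, penalty_param=0.05
)
optimizer = torch.optim.Adam(layer.parameters(), lr=1e-3)

x, target = torch.randn(128, 20), torch.randn(128, 30)
# Task loss plus the quadratic penalty term of the layer's weight.
loss = torch.nn.functional.mse_loss(layer(x), target) + layer.quad_penalty()
loss.backward()
optimizer.step()
optimizer.zero_grad()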

Example:

import torch
import cdopt.nn
import cdopt.manifold_torch

my_layer = cdopt.nn.Linear_cdopt(20, 30, manifold_class=cdopt.manifold_torch.symp_stiefel_torch)
input = torch.randn(128, 20)
output = my_layer(input)
print(output.size())
# expected to print torch.Size([128, 30])