Conv3d_cdopt

Conv3d_cdopt#

CLASS cdopt.nn.Conv3d_cdopt(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros', device=None, dtype=None, manifold_class = euclidean_torch, penalty_param = 0, weight_var_transfer = None, manifold_args = {})

Applies a 3D convolution over an input signal composed of several input planes. The basic introduction to convolution can be found at torch.nn.Conv3d.

This module supports TensorFloat32.

stride controls the stride for the cross-correlation, a single number or a one-element tuple.
padding controls the amount of padding applied to the input. It can be either a string {‘valid’, ‘same’} or a tuple of ints giving the amount of implicit padding applied on both sides.
dilation controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe, but this link has a nice visualization of what dilation does.
groups controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example,
- At groups=1, all inputs are convolved to all outputs.
- At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels and producing half the output channels, and both subsequently concatenated.
- At groups= in_channels, each input channel is convolved with its own set of filters (of size \(\frac{\text{out\_channels}}{\text{in\_channels}}\)).

The parameters kernel_size, stride, padding, dilation can either be:

a single int – in which case the same value is used for the height and width dimension

a tuple of two ints – in which case, the first int is used for the height dimension, and the second int for the width dimension

Parameters:#

in_channels (int) – Number of channels in the input image
out_channels (int) – Number of channels produced by the convolution
kernel_size (int or tuple) – Size of the convolving kernel
stride (int or tuple, optional) – Stride of the convolution. Default: 1
padding (int, tuple or str, optional) – Padding added to all four sides of the input. Default: 0
padding_mode (string*,* optional) – 'zeros', 'reflect', 'replicate' or 'circular'. Default: 'zeros'
dilation (int or tuple, optional) – Spacing between kernel elements. Default: 1
groups (int, optional) – Number of blocked connections from input channels to output channels. Default: 1
bias (bool, optional) – If True, adds a learnable bias to the output. Default: True
manifold_class – The manifold class for the weight matrix. Default: cdopt.manifold_torch.euclidean_torch
penalty_param – The penalty parameter for the quadratic penalty terms in constraint dissolving function
manifold_args - The additional key-word arguments that helps to define the manifold constraints.
weight_var_transfer (callable) – The function that transfer the weights (3D-tensor) to the shape of the variables of the manifold.
- The weight_var_transfer is called by
  weight_to_var, var_to_weight, var_shape = weight_var_transfer( tensor_shape )
- The inputs of weight_var_transfer should be the size of the weights. As for the outputs, weight_to_var is the callable function that transfer the weights to the variables of the manifold. var_to_weight is the callable function that transfers the variables of the manifold to the weights. var_shape is a tuple of ints that refers to the shape of the variables of the manifolds.
- Default:
  - var_shape = (torch.prod(torch.tensor(tensor_shape[:-1])), torch.tensor(tensor_shape[-1]))
  - weight_to_var = lambda X_tensor: torch.reshape(X_tensor, var_shape)
  - var_to_weight = lambda X_var: torch.reshape(X_var, tensor_shape)

Shapes:#

Input: \((N, C_{in}, D_{in}, H_{in}, W_{in})\) or \((C_{in}, D_{in},H_{in}, W_{in})\).
Output: \((N, C_{out}, D_{out}, H_{out}, W_{out})\) or \((C_{out}, D_{out}, H_{out}, W_{out})\),
where

\(D_{out} = \left\lfloor\frac{D_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1\right\rfloor\),

\(H_{out} = \left\lfloor\frac{H_{in} + 2 \times \text{padding}[1] - \text{dilation}[1] \times (\text{kernel\_size}[1] - 1) - 1}{\text{stride}[1]} + 1\right\rfloor\),

\(W_{out} = \left\lfloor\frac{W_{in} + 2 \times \text{padding}[2] - \text{dilation}[2] \times (\text{kernel\_size}[2] - 1) - 1}{\text{stride}[2]} + 1\right\rfloor\).

Attributes:#

manifold (cdopt manifold class) – the manifold that defines the constraints. The shape of the variables in manifold is set as var_shape.
weight (Tensor) – the learnable weights of the module of shape \((\mathrm{out\_channels}, \frac{\mathrm{in\_channels}}{\mathrm{groups}},\mathrm{kernel\_size[0]}, \mathrm{kernel\_size[1]})\). . The values are initialized from var_to_weight(manifold.Init_point(weight_to_var(Xinit))), where \(\mathrm{Xinit}\sim \mathcal{U}(-\sqrt{k}, \sqrt{k})\) where \(k = \frac{\mathrm{groups}}{C_\mathrm{in} * \prod_{i=0}^{2}\mathrm{kernel\_size}[i]}\).
bias (Tensor) – the learnable bias of the module of shape (out_channels). If bias is True, then the values of these weights are sampled from \(\mathcal{U}(-\sqrt{k}, \sqrt{k})\) where \(k = \frac{\mathrm{groups}}{C_\mathrm{in} * \prod_{i=0}^{2}\mathrm{kernel\_size}[i]}\).
quad_penalty (callable) – the function that returns the quadratic penalty terms of the weights. Its return value equals to \(||\mathrm{manifold.C}(\mathrm{weight})||^2\).

Example:#

# With square kernels and equal stride
m_layer = cdopt.nn.Conv3d_cdopt(16, 33, 3, stride=2, manifold_class=cdopt.manifold_torch.sphere_torch)
# non-square kernels and unequal stride and with padding
m_layer = cdopt.nn.Conv3d_cdopt(16, 33, (3, 5, 2), stride=(2, 1, 1), padding=(4, 2, 0), manifold_class=cdopt.manifold_torch.sphere_torch)
input = torch.randn(20, 16, 10, 50, 100)
output = m_layer(input)
print(output.size())