
Flint's C++ Deep Learning Framework

dl/layers.hpp

Overview

template <typename T, unsigned int index, int... w> class WeightRef 

FOR INTERNAL USE ONLY. Builds a compile-time linked list of Tensor pointers.
template <typename T>
concept GenericLayer =
	requires(T a, Tensor &t1, Tensor &t2,
			 Tensor &t3, Tensor &t4, AdamFactory fac,
			 std::vector grads) 

Concept of the methods a Layer for neural networks has to implement. Mind the static constexpr methods that determine the modifications of dimensionality and types of the input tensors, int transform_dimensionality(int) and FType transform_type(FType): they describe the behavior of your forward (i.e. if a tensor of dimensionality n and type T is passed to your forward, a tensor of dimensionality transform_dimensionality(n) and type transform_type(T) should be returned). It is highly recommended to derive your Layer from UntrainableLayer or Layer, since they already provide implementations for some methods. forward may consume its input tensor, since it isn't needed afterwards.
struct UntrainableLayer 

Implements blank methods for every method of GenericLayer that is not needed for a Layer that is not trainable. If you derive from this class you have to implement the forward method from the GenericLayer concept and - if the forward outputs another type or dimensionality than its parameter has - overload transform_type and transform_dimensionality.
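For illustration, a minimal sketch of a custom untrainable layer that changes the dimensionality of its input. The forward signature is modeled on the GenericLayer description above; the reduce_sum call is an assumption for this example, not a documented API of this header.

// Hypothetical untrainable layer that sums over the last axis of its
// input, reducing the dimensionality by one.
struct SumLastAxis : public UntrainableLayer {
		// the output has one dimension less than the input ...
		static constexpr int transform_dimensionality(int n) { return n - 1; }
		// ... but keeps its element type, so transform_type stays inherited

		template <typename T, unsigned int n>
		Tensor<T, n - 1> forward(Tensor<T, n> &in) {
				return in.reduce_sum(n - 1); // assumed reduction method
		}
		std::string name() { return "SumLastAxis"; }
};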
template <typename F, int... wn> class Layer 

Virtual super class of all Layer implementations with type safe weight management capabilities. The variadic template describes the dimensionalities of the individual weights, i.e. a Layer<double, 3, 4, 5> has three weights: Tensor<double, 3>, Tensor<double, 4>, Tensor<double, 5>. You have to initialize them by providing their initial state in the constructor; after that you may access references to them with the function get_weight<int index>(). If you derive from this class you have to implement the forward method from the GenericLayer concept and - if the forward outputs another type or dimensionality than its parameter has - overload transform_type and transform_dimensionality.
template <typename... args> Layer(args... weights) 

Initializes the weights by copying the provided ones. After that you may access them with get_weight<int index>().
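As an illustration, a sketch of a trainable layer with a single 1-dimensional weight. The tensor construction from an initializer list and the broadcasting element-wise multiplication in forward are assumptions for this example:

// Hypothetical layer that multiplies its input element-wise with a
// learnable 1-dimensional weight of three entries.
template <typename F = float>
struct LearnableScale : public Layer<F, 1> {
		// pass the initial weight state to the Layer constructor
		LearnableScale() : Layer<F, 1>(Tensor<F, 1>{1, 1, 1}) {}

		template <typename T, unsigned int n>
		Tensor<T, n> forward(Tensor<T, n> &in) {
				// get_weight<0>() yields a Tensor<F, 1>& to the managed weight
				return in * this->template get_weight<0>();
		}
		std::string name() { return "LearnableScale"; }
};

Type and dimensionality of the output match the input, so transform_type and transform_dimensionality keep their inherited defaults.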
template <int index, int dim> void set_weight(Tensor<F, dim> t) 

Sets a specific weight described by its index.
void set_weights(const std::vector<FGraphNode *> weights) 

Sets all weights from an array.
template <int index>
		Tensor<F, get_dim<index, wn...>()> &get_weight() 

Returns a reference to a specific weight described by its index.
template <OptimizerFactory Fac> void generate_optimizer(Fac factory) 

Creates an optimizer for each weight with the methods of the provided OptimizerFactory.
template <typename T, unsigned int dim>
		void optimize_weights(const Tensor<T, dim> &error) 

Calculates the gradients of each weight to the error tensor and optimizes them by their gradient with their optimizer (if one has been generated, see generate_optimizer()).
std::vector<FGraphNode *> collect_weights() 

Collects pointers to the underlying FGraphNode references of the weights. Useful for gradient calculation.
void optimize_weights(std::vector<FGraphNode *> grads) 

Takes already calculated gradients of the weights (the nth entry in grads corresponds to the nth weight) and optimizes them by their gradient with their optimizer (if one has been generated, see generate_optimizer()).
virtual std::string name() 

Returns the name of this Layer for overviews and debugging.
virtual std::string description() 

Returns a summary of this Layer for overviews and debugging.
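To make the optimizer workflow concrete, a minimal training-step sketch. It reuses the hypothetical LearnableScale layer from above; AdamFactory appears in the GenericLayer concept, while load_batch, load_labels and compute_loss are stand-ins:

// create one optimizer per weight, then update the weights after each pass
LearnableScale<float> layer;
layer.generate_optimizer(AdamFactory());

Tensor<float, 2> input = load_batch();   // hypothetical input batch
Tensor<float, 2> target = load_labels(); // hypothetical labels
Tensor<float, 2> out = layer.forward(input);
Tensor<float, 2> error = compute_loss(out, target); // hypothetical loss

// differentiates the error w.r.t. each weight and applies the optimizers
layer.optimize_weights(error);

Alternatively, the gradients can be calculated externally against the FGraphNode pointers returned by collect_weights() and passed to the optimize_weights overload that takes them as a std::vector.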

dl/layers/*

dl/layers/connected.hpp

Overview

template <typename F = float> struct Connected : public Layer

A fully connected neural network layer. A connected layer has a 2-dimensional weight matrix and a bias as parameters. The matrix is multiplied (with matrix multiplication) with the last two dimensions of the input tensor. The bias is added to the result (in practice this happens in one matrix multiplication: the input tensor is padded with a 1 in its last dimension and the bias is the last row of the matrix).
template <Initializer InitWeights, Initializer InitBias>
		Connected(size_t units_in, size_t units_out, InitWeights init_weights,
				  InitBias init_bias)
			: Layer(
				  Flint::concat(init_weights.template initialize(
									std::array

Creates the layer and initializes the weights.
  • units_in
    size of the last dimension of the input tensors (will be the size of the dimension before the last dimension of the weights).
  • units_out
    size of the last dimension the result tensor is supposed to have (will be the size of the last dimension of the weights).
  • init_weights
    a weight initializer (has to fulfill the Initializer concept, close to Gauss-distributed random values yield good results).
  • init_bias
    a bias initializer (has to fulfill the Initializer concept, small values yield good results, can be constant for a bias).
Connected(size_t units_in, size_t units_out)
			: Layer(Flint::concat(
				  GlorotUniform().template initialize(
					  std::array

Creates the layer and initializes the weights.
  • units_in
    size of the last dimension of the input tensors (will be the size of the dimension before the last dimension of the weights).
  • units_out
    size of the last dimension the result tensor is supposed to have (will be the size of the last dimension of the weights).
The weights are initialized with glorot uniform random values and the bias with 0s.
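A usage sketch for the default constructor; load_batch is a hypothetical stand-in for the input construction:

// maps the last input dimension from 64 features to 10 units
Connected<float> dense(64, 10); // glorot uniform weights, zero bias

Tensor<float, 2> batch = load_batch();       // hypothetical, shape (100, 64)
Tensor<float, 2> out = dense.forward(batch); // shape (100, 10)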

dl/layers/convolution.hpp

Overview

enum PaddingMode 

Padding of convolution operations
  • NO_PADDING
    a normal convolution operation. Each filter is slid over the input tensor with its step size as many times as it completely fits into the input tensor. The output may have a smaller size than the input.
  • SAME_PADDING
    the image tensor is padded on each side as equally as possible so that the output has the same size as the input if steps = 1 in all dimensions (i.e. the image is padded so that the kernels fit fully into the image).
  • FULL_PADDING
    the image tensor is padded on each side by the size of the kernel - 1 in that dimension. This yields as many kernel multiplications as possible with the given step size.
template <int n, typename F = float> class Convolution : public Layer

A generic Convolution layer. It creates multiple filters that are slid along the input in each dimension by a step size. Each time, the filter values are multiplied with the elements of the input tensor with which they are currently aligned, and the result (with the shape of the filter) is summed up to a single value in the resulting tensor. After that the filter is moved by its step size in each dimension and the process repeats. After the convolution is calculated, a learnable bias is added to the result per filter.
TL;DR: Each element in the result of this layer is a full multiplication of a filter with a corresponding window in the input array. This is especially helpful for image processing tasks, since the parameters (filters) allow recognizing location-independent features in the input.
You are supposed to configure this layer by providing a number of filters, a kernel_size and the size of the last dimension of the input tensor, i.e. the channels of the input tensor, called units_in. The template expects you to provide the dimensionality of the input tensor (including batch size and channels). The output size is the same as that of the input tensor in the first dimension (usually the batch_size); in the last dimension it is the number of filters, and in every other dimension it is the number of times each filter can be slid over the input tensor (depending on the size of the input tensor, the kernel_size, the step size and the padding, see PaddingMode).
E.g. if you have a batch of two dimensional rgb (3 channels) images, it would have a shape of (batch_size, height, width, 3). Then you would create a Convolution<4> layer (also called Conv2D) with units_in = 3. The output tensor would also be a 4 dimensional tensor. Let's say you don't use padding (NO_PADDING), 10 filters, a step size of 2 in each dimension, 32 as kernel_size and your 100 images have widths and heights of 128 (input_shape = (100, 128, 128, 3)). The output shape would be (batch_size, ceil((input_size - kernel_size + 1) / steps), ceil((input_size - kernel_size + 1) / steps), filters) = (100, 49, 49, 10).
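A quick plain C++ sketch checking that arithmetic (no Flint types involved):

#include <cmath>
#include <cstdio>

// output extent of one spatial dimension with NO_PADDING:
// ceil((input_size - kernel_size + 1) / steps)
int conv_output_size(int input_size, int kernel_size, int steps) {
		return (int)std::ceil((input_size - kernel_size + 1) / (double)steps);
}

int main() {
		// (100, 128, 128, 3), kernel_size 32, steps 2 -> (100, 49, 49, 10)
		std::printf("%d\n", conv_output_size(128, 32, 2)); // prints 49
}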
template <Initializer InitWeights, Initializer InitBias>
		Convolution(size_t units_in, unsigned int filters,
					unsigned int kernel_size, InitWeights weight_init,
					InitBias bias_init, std::array stride,
					PaddingMode padding_mode = NO_PADDING)
			: Layer(weight_init.template initialize(
								 weight_shape(filters, kernel_size, units_in)),
							 bias_init.template initialize(
								 std::array

Initializes the Convolution Layer.
  • units_in
    number of channels (size of the last dimension) of the input tensor
  • filters
    number of used filters (size of the last dimension of the result tensor)
  • kernel_size
    size of the filters
  • weight_init
    Initializer for the filters, has to implement the Initializer concept, should generate random values close to a normal distribution
  • bias_init
    Initializer for the bias, has to implement the Initializer concept, should generate small values, constant values like 0 are fine
  • stride
    step size per dimension (2 dimensions less than the input tensor, since the convolution is broadcasted along the batch_size and the channels in the last dimension are fully reduced)
  • padding_mode
    which type of padding to use (see PaddingMode for more information)
Convolution(size_t units_in, unsigned int filters,
					unsigned int kernel_size,
					std::array stride,
					PaddingMode padding_mode = NO_PADDING)
			: Layer(GlorotUniform().template initialize(
								 weight_shape(filters, kernel_size, units_in)),
							 ConstantInitializer().template initialize(
								 std::array

Initializes the Convolution Layer.
  • units_in
    number of channels (size of the last dimension) of the input tensor
  • filters
    number of used filters (size of the last dimension of the result tensor)
  • kernel_size
    size of the filters
  • stride
    step size per dimension (2 dimensions less than the input tensor, since the convolution is broadcasted along the batch_size and the channels in the last dimension are fully reduced)
  • padding_mode
    which type of padding to use (see PaddingMode for more information)
The filters are initialized with a glorot uniform distribution.
typedef Convolution<4> Conv2D

For inputs of images with shape (batch_size, width, height, channels).
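Continuing the image example from above, a usage sketch; the exact element type of the stride array is an assumption derived from the constructor parameters, and load_images is a hypothetical stand-in:

// 100 RGB images of 128x128 -> shape (100, 128, 128, 3)
Conv2D conv(3, 10, 32, std::array<unsigned int, 2>{2, 2}); // NO_PADDING default

Tensor<float, 4> images = load_images();     // hypothetical input batch
Tensor<float, 4> out = conv.forward(images); // shape (100, 49, 49, 10)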

dl/layers/normalization.hpp

Overview

class Dropout : public UntrainableLayer 

Randomly sets some values in the input to 0 with a probability of p. Reduces overfitting. Degenerates to an identity function when training is false.
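A usage sketch; that the dropout probability p is passed to the constructor is an assumption based on the description above:

Dropout dropout(0.2); // hypothetical: zeroes each value with probability 0.2

Tensor<float, 2> regularized = dropout.forward(hidden); // hidden: some activation
// once training is set to false, forward degenerates to the identity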

dl/activations.hpp

Overview

class SoftMax : public UntrainableLayer 

SoftMax activation Layer. For multiclass classification.
SoftMax(int ax = -1) : ax(ax) 

Initializes the SoftMax function with an optional axis parameter that describes the dimension over which the sum will be taken (may be negative, in which case it indexes from the back, i.e. -1 means the last axis, -2 the one before the last, etc.). Calculates exp(in) / sum(exp(in), ax).
struct Relu : public UntrainableLayer 

Rectified Linear Unit. Computes max(input, 0). Simple, and it works.
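To close, a sketch chaining the documented layers through their forward methods (this section documents no model-composition helper, so the chaining is manual; load_batch is a hypothetical stand-in):

Connected<float> hidden(784, 128);
Relu relu;
Connected<float> head(128, 10);
SoftMax softmax; // sums over the last axis by default

Tensor<float, 2> batch = load_batch(); // hypothetical, shape (N, 784)
Tensor<float, 2> a1 = hidden.forward(batch);
Tensor<float, 2> a2 = relu.forward(a1);
Tensor<float, 2> logits = head.forward(a2);
Tensor<float, 2> probabilities = softmax.forward(logits); // shape (N, 10)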