dl/layers.hpp
Overview
Types and Functions
• template <typename T, unsigned int index, int... w> class WeightRef
• template <typename T> concept GenericLayer = requires(T a, Tensor &t1, Tensor &t2,
Tensor &t3, Tensor &t4, AdamFactory fac,
std::vector grads)
• struct UntrainableLayer
• template <typename F, int... wn> class Layer
• template <typename... args> Layer(args... weights)
• template <int index, int dim> void set_weight(Tensor t)
• void set_weights(const std::vector weights)
• template <int index> Tensor<F, get_dim<index, wn...>()> &get_weight()
• template <OptimizerFactory Fac> void generate_optimizer(Fac factory)
• template <typename T, unsigned int dim> void optimize_weights(const Tensor &error)
• std::vector<FGraphNode *> collect_weights()
• void optimize_weights(std::vector grads)
• virtual std::string name()
• virtual std::string description()
template <typename T, unsigned int index, int... w> class WeightRef
FOR INTERNAL USE ONLY
Builds a compile-time linked list of Tensor pointers.
template <typename T> concept GenericLayer = requires(T a, Tensor &t1, Tensor &t2, Tensor &t3, Tensor &t4, AdamFactory fac, std::vector grads)
Concept of the methods a Layer for neural networks has to implement.
Mind the static constexpr methods int transform_dimensionality(int) and
FType transform_type(FType), which determine how the layer modifies the
dimensionality and type of its input tensors. They describe the type of your
forward, i.e. if a tensor of dimensionality n and type T is passed to forward,
a tensor of dimensionality transform_dimensionality(n) and type
transform_type(T) should be returned. It is highly recommended to derive your
Layer from UntrainableLayer or Layer, since they already provide
implementations for some methods.
forward may consume its input tensor since it isn't needed afterwards.
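As a rough illustration of the two static constexpr methods, the sketch below models one layer that reduces the dimensionality and one that changes the element type. `ElemType`, `SumLastAxis` and `ToDouble` are hypothetical stand-ins that do not exist in the library (the real concept works on `FType` and tensors); only the method names `transform_dimensionality` and `transform_type` are taken from the description above.

```cpp
// Stand-in for the library's FType; the enumerators are assumptions.
enum class ElemType { Int32, Int64, Float32, Float64 };

// Hypothetical layer whose forward would sum over the last axis:
// the dimensionality drops by one, the element type is unchanged.
struct SumLastAxis {
    static constexpr int transform_dimensionality(int n) { return n - 1; }
    static constexpr ElemType transform_type(ElemType t) { return t; }
};

// Hypothetical layer whose forward would always produce doubles:
// the dimensionality is unchanged, the element type becomes Float64.
struct ToDouble {
    static constexpr int transform_dimensionality(int n) { return n; }
    static constexpr ElemType transform_type(ElemType) { return ElemType::Float64; }
};

// Both functions are usable at compile time, so the result type of a chain of
// layers can be worked out before any data exists.
static_assert(SumLastAxis::transform_dimensionality(3) == 2);
static_assert(ToDouble::transform_type(ElemType::Int32) == ElemType::Float64);
```

Because the methods are constexpr, a mismatch between what forward returns and what the two methods announce can be caught by the compiler instead of at run time.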
struct UntrainableLayer
Implements blank methods for every method of GenericLayer that is not needed
for a Layer that is not trainable.
If you derive from this class you have to implement the forward method from
the GenericLayer concept and, if forward outputs another type or
dimensionality than its parameter has, overload transform_type and
transform_dimensionality.
template <typename F, int... wn> class Layer
Virtual super class of all Layer implementations with type-safe weight
management capabilities. The variadic template describes the dimensionality
of the individual weights, i.e. a Layer<double, 3,4,5> has three weights:
Tensor<double, 3>, Tensor<double, 4> and Tensor<double, 5>. You have to
initialize them by providing their initial state in the constructor; after
that you may access references to them with the function
get_weight<int index>(). If you derive from this class you have to implement
the forward method from the GenericLayer concept and, if forward outputs
another type or dimensionality than its parameter has, overload
transform_type and transform_dimensionality.
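The way a variadic list of dimensionalities can be indexed at compile time might be sketched as follows. `dim_at` and `LayerDims` are hypothetical names for illustration; they are not the library's `get_dim` or `Layer`, whose implementation is not shown in this reference.

```cpp
#include <cstddef>

// Hypothetical helper (not the library's get_dim): picks the index-th entry
// of an integer parameter pack at compile time.
template <std::size_t index, int first, int... rest>
constexpr int dim_at() {
    if constexpr (index == 0) {
        return first;
    } else {
        static_assert(sizeof...(rest) > 0, "weight index out of range");
        return dim_at<index - 1, rest...>();
    }
}

// Hypothetical stand-in for a layer with weights of dimensionality wn...
template <typename F, int... wn>
struct LayerDims {
    static constexpr std::size_t weight_count = sizeof...(wn);
    template <std::size_t index>
    static constexpr int weight_dim = dim_at<index, wn...>();
};

// A layer declared with <double, 3, 4, 5> has three weights of
// dimensionality 3, 4 and 5.
static_assert(LayerDims<double, 3, 4, 5>::weight_count == 3);
static_assert(LayerDims<double, 3, 4, 5>::weight_dim<0> == 3);
static_assert(LayerDims<double, 3, 4, 5>::weight_dim<2> == 5);
```

Resolving the dimensionality from the index at compile time is what lets an accessor return a correctly typed tensor reference for each weight.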
template <typename... args> Layer(args... weights)
Initializes the weights by copying the provided ones.
After that you may access them with
get_weight<int index>().
template <int index, int dim> void set_weight(Tensor t)
Sets a specific weight described by its index
void set_weights(const std::vector weights)
Sets all weights from an array
template <int index> Tensor<F, get_dim<index, wn...>()> &get_weight()
Returns a reference to a specific weight described by its index
template <OptimizerFactory Fac> void generate_optimizer(Fac factory)
Creates an optimizer for each weight with the methods of the
provided
OptimizerFactory
template <typename T, unsigned int dim> void optimize_weights(const Tensor &error)
Calculates the gradient of the error tensor with respect to each weight and
updates every weight from its gradient using its optimizer (if one has been
generated, see generate_optimizer()).
std::vector<FGraphNode *> collect_weights()
Collects pointers to the underlying FGraphNode references of the weights.
Useful for gradient calculation.
void optimize_weights(std::vector grads)
Takes already calculated gradients of the weights (the nth entry in grads
corresponds to the nth weight) and updates every weight from its gradient
using its optimizer (if one has been generated, see generate_optimizer()).
virtual std::string name()
Returns the name of this Layer for overviews and debugging.
virtual std::string description()
Returns a summary of this Layer for overviews and debugging.
dl/layers/*
dl/layers/connected.hpp
Overview
Types and Functions
• template <typename F = float> struct Connected : public Layer
• template <Initializer InitWeights, Initializer InitBias> Connected(size_t units_in, size_t units_out, InitWeights init_weights, InitBias init_bias) : Layer(
Flint::concat(init_weights.template initialize(
std::array
• Connected(size_t units_in, size_t units_out) : Layer(Flint::concat(
GlorotUniform().template initialize(
std::array
template <typename F = float> struct Connected : public Layer
Fully connected neural network layer.
A connected layer has a 2 dimensional matrix and a bias as parameters.
The matrix is multiplied (with matrix multiplication) with the last two
dimensions of the input tensor. The bias is added on the result (in practice
this happens in one matrix multiplication, the input tensor is padded with a
1 in its last dimension and the bias is the last row of the matrix).
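The bias trick described above can be sketched on plain nested vectors. This is a standalone illustration, not the library's implementation: `Matrix`, `matmul` and `connected_forward` are hypothetical names, and only two-dimensional inputs are handled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Plain matrix product a (n x k) * b (k x m); a stand-in for a tensor matmul.
Matrix matmul(const Matrix &a, const Matrix &b) {
    assert(!a.empty() && !b.empty() && a[0].size() == b.size());
    Matrix out(a.size(), std::vector<double>(b[0].size(), 0.0));
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < b[0].size(); ++j)
            for (std::size_t l = 0; l < b.size(); ++l)
                out[i][j] += a[i][l] * b[l][j];
    return out;
}

// Fully connected forward pass: `weights` has units_in + 1 rows, the last row
// being the bias. Each input row is padded with a trailing 1 so that a single
// matrix product yields x * W + b.
Matrix connected_forward(const Matrix &input, const Matrix &weights) {
    Matrix padded = input;
    for (auto &row : padded)
        row.push_back(1.0);
    return matmul(padded, weights);
}
```

With `units_in = 2` and `units_out = 2`, a weight matrix `{{1, 2}, {3, 4}, {10, 20}}` applied to the input row `{1, 1}` gives `{14, 26}`: the product with the first two rows plus the bias row.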
template <Initializer InitWeights, Initializer InitBias> Connected(size_t units_in, size_t units_out, InitWeights init_weights, InitBias init_bias) : Layer( Flint::concat(init_weights.template initialize ( std::array
Creates the layer and initializes the weights.
units_in
size of the last dimension of the input tensors (will be
units_out
size of the last dimension the result tensor is
init_weights
a weight initializer (has to fulfill the
Initializer concept; values close to a Gaussian distribution yield good results).
init_bias
a bias initializer (has to fulfill the Initializer concept)
Connected(size_t units_in, size_t units_out) : Layer(Flint::concat( GlorotUniform().template initialize ( std::array
Creates the layer and initializes the weights.
units_in
size of the last dimension of the input tensors (will be
units_out
size of the last dimension the result tensor is
dl/layers/convolution.hpp
Overview
Types and Functions
• enum PaddingMode
• template <int n, typename F = float> class Convolution : public Layer
• template <Initializer InitWeights, Initializer InitBias> Convolution(size_t units_in, unsigned int filters, unsigned int kernel_size, InitWeights weight_init, InitBias bias_init, std::array stride,
PaddingMode padding_mode = NO_PADDING)
: Layer(weight_init.template initialize(
weight_shape(filters, kernel_size, units_in)),
bias_init.template initialize(
std::array
• Convolution(size_t units_in, unsigned int filters, unsigned int kernel_size, std::array stride,
PaddingMode padding_mode = NO_PADDING)
: Layer(GlorotUniform().template initialize(
weight_shape(filters, kernel_size, units_in)),
ConstantInitializer().template initialize(
std::array
• typedef Convolution<4> Conv2D
enum PaddingMode
Padding of convolution operations
NO_PADDING
: a normal convolution operation. Each filter is slid over the input tensor with its step size as many times as it completely fits into the input tensor. The output may have a smaller size than the input.
SAME_PADDING
: the image tensor is padded on each side as equally as possible so that the output has the same size as the input if steps = 1 in all dimensions (i.e. the image is padded so that the kernels fit fully into the image).
FULL_PADDING
: the image tensor is padded on each side by the size of the kernel - 1 in that dimension. This yields as many kernel multiplications as possible with the given step size.
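The effect of the three modes on the output length along one dimension might be sketched as below. The padding totals are derived from the descriptions above and are an assumption about the arithmetic, not taken from the library source. The enum is redeclared locally so the sketch is self-contained, and `total_padding` and `output_length` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Local mirror of the documented enum, so that the sketch compiles on its own.
enum PaddingMode { NO_PADDING, SAME_PADDING, FULL_PADDING };

// Total number of padded elements along one dimension (both sides together),
// derived from the mode descriptions.
std::size_t total_padding(PaddingMode mode, std::size_t kernel) {
    switch (mode) {
    case SAME_PADDING: return kernel - 1;        // split as equally as possible
    case FULL_PADDING: return 2 * (kernel - 1);  // kernel - 1 on each side
    default:           return 0;                 // NO_PADDING
    }
}

// Number of filter positions along one dimension: the filter starts at 0 and
// advances by `step` while it still fits completely into the padded input.
std::size_t output_length(std::size_t input, std::size_t kernel,
                          std::size_t step, PaddingMode mode) {
    assert(kernel >= 1 && step >= 1);
    const std::size_t padded = input + total_padding(mode, kernel);
    if (padded < kernel)
        return 0;
    return (padded - kernel) / step + 1;
}
```

For an input of length 128 and a kernel of size 32 this gives 128 positions with SAME_PADDING and a step of 1, and 159 positions with FULL_PADDING and a step of 1.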
template <int n, typename F = float> class Convolution : public Layer
A generic Convolution layer. It creates multiple filters that are slid along
the input in each dimension by a step size. Each time the filter values are
multiplied with the elements of the input tensor with which it is currently
aligned and the result (with shape of the filter) is summed up to a single
value in the resulting tensor. After that the filter is moved by its step
size in each dimension and the process repeats. After the convolution is
calculated a learnable bias is added to the result per filter.
TL;DR: Each element in the result of this layer is a full multiplication of a
filter with a corresponding window in the input array. This is especially
helpful for image processing tasks, since the parameters (filters) allow the
layer to recognize location-independent features in the input.
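The sliding-window computation can be written out naively for a single spatial dimension. This is a sketch under simplifying assumptions (no batch dimension, no padding, nested vectors instead of tensors); `Signal`, `Filter` and `convolve1d` are hypothetical names and not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Signal = std::vector<std::vector<double>>;  // [position][channel]
using Filter = std::vector<std::vector<double>>;  // [kernel offset][channel]

// Naive strided convolution without padding over one spatial dimension.
// Returns a [window position][filter] array. Every filter is assumed to have
// the same kernel size and the same channel count as the input.
std::vector<std::vector<double>> convolve1d(const Signal &input,
                                            const std::vector<Filter> &filters,
                                            const std::vector<double> &bias,
                                            std::size_t step) {
    assert(step >= 1 && !filters.empty() && filters.size() == bias.size());
    const std::size_t kernel = filters[0].size();
    std::vector<std::vector<double>> out;
    // Slide the window while the filter still fits completely into the input.
    for (std::size_t start = 0; start + kernel <= input.size(); start += step) {
        std::vector<double> row;
        for (std::size_t f = 0; f < filters.size(); ++f) {
            double sum = bias[f];  // learnable bias, one per filter
            for (std::size_t k = 0; k < kernel; ++k)
                for (std::size_t c = 0; c < input[start + k].size(); ++c)
                    sum += filters[f][k][c] * input[start + k][c];
            row.push_back(sum);
        }
        out.push_back(row);
    }
    return out;
}
```

Each output row holds one value per filter, which is why the last dimension of the result equals the number of filters while the channel dimension of the input disappears.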
You are supposed to configure this layer by providing a number of filters, a
kernel_size and the size of the last dimension of the input tensor, i.e. the
channels of the input tensor, called units_in. The template expects you to
provide the dimensionality of the input tensor (including batch size and
channels). The output size is the same as that of the input tensor in the
first dimension (usually the batch_size), in the last dimension it is the
number of filters, and in every other dimension it is the number of times each
filter can be slid against the input tensor (depending on the size of the
input tensor, the kernel_size, the step size and the padding, see
PaddingMode). E.g. a batch of two-dimensional RGB (3 channels) images has a
shape of (batch_size, height, width, 3). For it you would create a
Convolution<4> layer (also called Conv2D) with units_in = 3. The output tensor
would also be a 4-dimensional tensor. Let's say you don't use padding
(NO_PADDING) and have 10 filters, a step size of 2 in each dimension, a
kernel_size of 32 and 100 images with widths and heights of 128
(input_shape = (100, 128, 128, 3)). The output size would be
(batch_size, ceil((input_shape - kernel_size + 1) / steps), ceil((input_shape - kernel_size + 1) / steps), filters) = (100, 49, 49, 10).
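The worked example can be checked with a small shape calculation. `conv_output_shape` is a hypothetical helper written for this illustration; it covers only the NO_PADDING case and applies the formula from the text to every dimension between the batch and the channels.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Output shape of a convolution without padding: the first dimension (batch)
// is kept, the last becomes the number of filters, and every dimension in
// between becomes ceil((size - kernel_size + 1) / step).
std::vector<std::size_t> conv_output_shape(const std::vector<std::size_t> &input_shape,
                                           std::size_t filters, std::size_t kernel_size,
                                           const std::vector<std::size_t> &stride) {
    assert(input_shape.size() >= 3 && stride.size() == input_shape.size() - 2);
    std::vector<std::size_t> out;
    out.push_back(input_shape.front());
    for (std::size_t d = 1; d + 1 < input_shape.size(); ++d) {
        assert(input_shape[d] >= kernel_size && stride[d - 1] >= 1);
        const std::size_t positions = input_shape[d] - kernel_size + 1;
        out.push_back((positions + stride[d - 1] - 1) / stride[d - 1]);  // integer ceil
    }
    out.push_back(filters);
    return out;
}
```

Calling `conv_output_shape({100, 128, 128, 3}, 10, 32, {2, 2})` reproduces the shape from the example: 128 - 32 + 1 = 97 window positions per dimension, and ceil(97 / 2) = 49.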
template <Initializer InitWeights, Initializer InitBias> Convolution(size_t units_in, unsigned int filters, unsigned int kernel_size, InitWeights weight_init, InitBias bias_init, std::array stride, PaddingMode padding_mode = NO_PADDING) : Layer (weight_init.template initialize ( weight_shape(filters, kernel_size, units_in)), bias_init.template initialize ( std::array
Initializes the Convolution Layer.
units_in
number of channels (size of last dimension) of input
filters
number of used filters (size of last dimension of the
kernel_size
size of filters
weight_init
Initializer for filters, has to implement the
Initializer concept, should generate random values close to a normal distribution
bias_init
Initializer for the bias, has to implement the
Initializer concept, should generate small values, constant values like
0 are fine
stride
step size per dimension (2 dimensions less than the input tensor, since the convolution is broadcast along the
batch_size and the channels in the last dimension are fully reduced)
padding_mode
which type of padding to use (see PaddingMode
for
Convolution(size_t units_in, unsigned int filters, unsigned int kernel_size, std::array stride, PaddingMode padding_mode = NO_PADDING) : Layer (GlorotUniform().template initialize ( weight_shape(filters, kernel_size, units_in)), ConstantInitializer().template initialize ( std::array
Initializes the Convolution Layer.
units_in
number of channels (size of last dimension) of input
filters
number of used filters (size of last dimension of the
kernel_size
size of filters
stride
step size per dimension (2 dimensions less than the input tensor, since the convolution is broadcast along the
batch_size and the channels in the last dimension are fully reduced)
padding_mode
which type of padding to use (see PaddingMode
for
typedef Convolution<4> Conv2D
For inputs of images with shape
(batch_size, width, height, channels)
dl/layers/normalization.hpp
Overview
class Dropout : public UntrainableLayer
Randomly sets some values in the input to 0 with a probability of p.
Reduces overfitting. Degenerates to an identity function when training is
false.
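The behaviour described above might look like this on a flat array. The function `dropout`, its parameter order and the explicit random engine are assumptions made for the sketch. No rescaling of the surviving values is done, since the description does not mention any.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Sketch of dropout on a flat array. While training, each value is set to 0
// with probability p; otherwise the input is returned unchanged.
std::vector<double> dropout(std::vector<double> values, double p, bool training,
                            std::mt19937 &rng) {
    assert(p >= 0.0 && p <= 1.0);
    if (!training)
        return values;  // identity outside of training
    std::bernoulli_distribution drop(p);
    for (double &v : values)
        if (drop(rng))
            v = 0.0;
    return values;
}
```

Passing the random engine in explicitly keeps the sketch reproducible: the same seed yields the same dropped positions.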
dl/activations.hpp
Overview
Types and Functions
• class SoftMax : public UntrainableLayer
• SoftMax(int ax = -1) : ax(ax)
• struct Relu : public UntrainableLayer
class SoftMax : public UntrainableLayer
SoftMax activation Layer. For multiclass classification.
SoftMax(int ax = -1) : ax(ax)
Initializes the SoftMax function with an optional axis parameter
that describes the dimension over which the sum will be taken (may be
negative, in which case it will index from the back, i.e. -1 means the
last axis, -2 the one before the last etc.). Calculates
exp(in) / sum(exp(in), ax)
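A minimal sketch of the usual softmax definition, in which every entry is exponentiated and divided by the sum of the exponentials along the chosen axis, together with the negative-axis convention. It operates on a flat array standing in for a single axis; `normalize_axis` and `softmax` are hypothetical free functions, not the layer's interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Maps a possibly negative axis to a non-negative index, so that -1 is the
// last axis, -2 the one before the last, and so on.
int normalize_axis(int ax, int dims) {
    assert(ax >= -dims && ax < dims);
    return ax < 0 ? ax + dims : ax;
}

// Softmax along one axis, shown for a flat array: every entry is
// exponentiated and divided by the sum of all exponentials.
std::vector<double> softmax(const std::vector<double> &in) {
    double sum = 0.0;
    std::vector<double> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = std::exp(in[i]);
        sum += out[i];
    }
    for (double &v : out)
        v /= sum;
    return out;
}
```

The results are positive and sum to 1 along the axis, which is what makes them usable as class probabilities. For large inputs, subtracting the maximum before exponentiating is a common way to avoid overflow; it is left out here for brevity.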
struct Relu : public UntrainableLayer
Rectified Linear Unit. Computes max(input, 0). Simple and it works.
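For completeness, the element-wise operation can be sketched in a few lines. Both `relu` overloads are hypothetical free functions on plain doubles, not the layer's forward method.

```cpp
#include <vector>

// Scalar form of the activation: max(x, 0).
constexpr double relu(double x) { return x > 0.0 ? x : 0.0; }

// Element-wise application to a flat array.
std::vector<double> relu(std::vector<double> values) {
    for (double &v : values)
        v = relu(v);
    return values;
}
```

Negative entries become 0 and positive entries pass through unchanged.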