dl/optimizers.hpp
Overview
Types and Functions
• template <int n, typename F = float> struct Optimizer
• virtual Tensor<F, n> update(Tensor<F, n> &weights, Tensor<F, n> &gradient) = 0
• template <typename T> concept OptimizerFactory = requires(T fac)
• template <int n, typename F = float> struct Adam : public Optimizer<n, F>
• Adam(F learning_rate = 0.0015, F b1 = 0.9, F b2 = 0.999) : learning_rate(learning_rate), b1(b1), b2(b2)
• struct AdamFactory
• AdamFactory(double learning_rate = 0.0015, double b1 = 0.9, double b2 = 0.999) : learning_rate(learning_rate), b1(b1), b2(b2)
• template <int n> Optimizer<n> *generate_optimizer() const
template <int n, typename F = float> struct Optimizer
Optimizer interface that defines an update method.
An optimizer is intended to be instantiated once per weight and optimizes float or double weights.
The type parameter n denotes the dimensionality of the weight this optimizer was generated for.
virtual Tensor<F, n> update(Tensor<F, n> &weights, Tensor<F, n> &gradient) = 0
Takes the old weight and the gradient of the error with respect to that weight, updates it,
and returns the updated version of the weight.
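To illustrate how a derivation of this interface might look, here is a minimal sketch of a plain gradient-descent optimizer. SimpleOptimizer, SgdOptimizer, and the use of std::vector in place of the library's Tensor<F, n> are assumptions made for a self-contained example, not part of dl/optimizers.hpp:

#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Optimizer interface, with std::vector
// replacing the library's Tensor<F, n>.
template <typename F = float>
struct SimpleOptimizer {
  virtual ~SimpleOptimizer() = default;
  virtual std::vector<F> update(std::vector<F> &weights,
                                std::vector<F> &gradient) = 0;
};

// Plain stochastic gradient descent: w <- w - lr * g, element-wise.
template <typename F = float>
struct SgdOptimizer : public SimpleOptimizer<F> {
  F learning_rate;
  explicit SgdOptimizer(F learning_rate = 0.01) : learning_rate(learning_rate) {}

  std::vector<F> update(std::vector<F> &weights,
                        std::vector<F> &gradient) override {
    for (std::size_t i = 0; i < weights.size(); ++i)
      weights[i] -= learning_rate * gradient[i];
    return weights;
  }
};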
template <typename T> concept OptimizerFactory = requires(T fac)
An OptimizerFactory is used to generate optimizers on the heap with predefined parameters.
It is needed so that a new optimizer can be generated per weight.
For each derivation of Optimizer there should be one factory that generates instances of that optimizer for the weights.
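The requires-clause itself is not reproduced in this documentation. Judging from AdamFactory below, a factory concept of this shape could plausibly demand a heap-allocating generate_optimizer<n>() member; the following sketch is an assumption, not the header's actual definition:

// Hypothetical sketch of an OptimizerFactory-style concept (C++20).
// The real requires-expression in dl/optimizers.hpp may differ.
template <typename T>
concept SimpleOptimizerFactory = requires(T fac) {
  // A factory must be able to stamp out an optimizer for a weight of
  // some fixed dimensionality, e.g. a 2-dimensional weight matrix.
  fac.template generate_optimizer<2>();
};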
template <int n, typename F = float> struct Adam : public Optimizer<n, F>
Implementation of the Adam algorithm (first-order gradient-based optimizer
for stochastic objective functions based on adaptive estimates of lower-order
moments).
Adam(F learning_rate = 0.0015, F b1 = 0.9, F b2 = 0.999) : learning_rate(learning_rate), b1(b1), b2(b2)
Initializes the Adam algorithm with some parameters that influence optimization speed and accuracy (a single update step is sketched after this list).
- learning_rate: (sometimes called alpha) the step size per optimization step, i.e. the proportion by which weights are updated. Higher values (e.g. 0.2) lead to faster convergence, while lower values yield more accurate convergence.
- b1: (sometimes called beta1) the exponential decay rate for the first moment estimates.
- b2: (sometimes called beta2) the exponential decay rate for the second moment estimates.
You can tune the individual members later on too.
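For reference, one Adam step on a single scalar weight looks as follows. This follows the published algorithm (Kingma & Ba, 2014); the header applies it element-wise to Tensors, and the epsilon term and exact bias-correction details here are assumptions about this implementation:

#include <cmath>

// One Adam step for a scalar weight w with gradient g at time step t
// (t starts at 1). m and v carry the running moment estimates.
float adam_step(float w, float g, float &m, float &v, int t,
                float lr = 0.0015f, float b1 = 0.9f, float b2 = 0.999f,
                float eps = 1e-8f) {
  m = b1 * m + (1.0f - b1) * g;      // first moment (mean) estimate
  v = b2 * v + (1.0f - b2) * g * g;  // second moment (variance) estimate
  float m_hat = m / (1.0f - std::pow(b1, t));  // bias-corrected moments
  float v_hat = v / (1.0f - std::pow(b2, t));
  return w - lr * m_hat / (std::sqrt(v_hat) + eps);
}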
struct AdamFactory
Factory that constructs Adam optimizers with preset parameters.
AdamFactory(double learning_rate = 0.0015, double b1 = 0.9, double b2 = 0.999) : learning_rate(learning_rate), b1(b1), b2(b2)
Initialization parameters for the Adam algorithm that influence optimization speed and accuracy.
- learning_rate: (sometimes called alpha) the step size per optimization step, i.e. the proportion by which weights are updated. Higher values (e.g. 0.2) lead to faster convergence, while lower values yield more accurate convergence.
- b1: (sometimes called beta1) the exponential decay rate for the first moment estimates.
- b2: (sometimes called beta2) the exponential decay rate for the second moment estimates.
All Adam instances generated by generate_optimizer are constructed with the given parameters.
template <int n> Optimizer<n> *generate_optimizer() const
Generates an Adam optimizer for an n-dimensional weight.
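Putting the factory to work might look like the sketch below. It only uses members documented above; the Tensor construction and the update call are elided because their exact API is not shown here, and deleting through the base pointer assumes Optimizer has a virtual destructor:

#include "dl/optimizers.hpp"

void sketch() {
  // One factory holds the hyperparameters; each weight gets its own
  // Adam instance so the moment estimates are tracked per weight.
  AdamFactory factory(0.001, 0.9, 0.999);
  Optimizer<2> *matrix_opt = factory.generate_optimizer<2>();
  Optimizer<1> *bias_opt = factory.generate_optimizer<1>();
  // ... per training step: weight = matrix_opt->update(weight, gradient);
  delete matrix_opt;
  delete bias_opt;
}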
dl/losses.hpp
Overview
Types and Functions
• template <typename T = float> concept GenericLoss = requires(T a, Tensor &t1, Tensor &t2, Tensor &t3, Tensor &t4)
• struct CrossEntropyLoss
template <typename T = float> concept GenericLoss = requires(T a, Tensor &t1, Tensor &t2, Tensor &t3, Tensor &t4)
Defines the general concept of a Loss function.
It receives two tensors: the actual output and the expected one.
It then calculates the loss as a double Tensor (since the weights are always
double Tensors as well).
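The concept body is not reproduced here, and the four tensor parameters suggest it constrains both the loss itself and its derivative. As a purely illustrative example, a type of the general shape described above might look like this (std::vector stands in for Tensor, and both member names are assumptions):

#include <cstddef>
#include <vector>

// Hypothetical loss type in the spirit of GenericLoss: it maps the
// actual output and the expected output to a loss value. Mean squared
// error is used as a simple stand-in.
struct MseLoss {
  double loss(const std::vector<double> &in,
              const std::vector<double> &expected) const {
    double sum = 0.0;
    for (std::size_t i = 0; i < in.size(); ++i) {
      double diff = in[i] - expected[i];
      sum += diff * diff;
    }
    return sum / static_cast<double>(in.size());
  }

  // Derivative of the loss with respect to each element of `in`.
  std::vector<double> loss_prime(const std::vector<double> &in,
                                 const std::vector<double> &expected) const {
    std::vector<double> grad(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
      grad[i] = 2.0 * (in[i] - expected[i]) / static_cast<double>(in.size());
    return grad;
  }
};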
struct CrossEntropyLoss
Calculates the Categorical Cross Entropy Loss with full summation. It is advised to apply a softmax as the last activation layer when computing in. Calculates:
sum(-expected * log(in))
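A direct translation of that formula, with std::vector standing in for the library's Tensor type (an assumption made for the sake of a runnable example):

#include <cmath>
#include <cstddef>
#include <vector>

// Categorical cross entropy with full summation: sum(-expected * log(in)).
// `in` should be a probability distribution, e.g. the output of a softmax.
double cross_entropy(const std::vector<double> &in,
                     const std::vector<double> &expected) {
  double sum = 0.0;
  for (std::size_t i = 0; i < in.size(); ++i)
    sum += -expected[i] * std::log(in[i]);
  return sum;
}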