Build models with GraphModel and the SequentialBuilder helper, or load ONNX models with GraphModel::load_model. Training is handled by Trainer together with a DataLoader, an Optimizer and a LossFunction.
src/dl/model.hpp
Overview
Types and Functions
• struct GraphModel
• FGraphNode *operator()( FGraphNode *in, std::optional<std::reference_wrapper<std::map<LayerGraph *, long>>> time_per_layer = std::nullopt)
• std::vector<FGraphNode *> operator()( std::vector<FGraphNode *> in, std::optional<std::reference_wrapper<std::map<LayerGraph *, long>>> time_per_layer = std::nullopt)
• std::string serialize_onnx()
• std::vector<std::vector<size_t>> shape_interference(std::vector<std::vector<size_t>> input_shapes)
• static GraphModel *load_model(std::string path)
• static GraphModel *sequential(std::vector<LayerGraph *> list)
• static GraphModel *from_output(LayerGraph *output)
• static SequentialBuilder builder()
• struct SequentialBuilder
• SequentialBuilder &add(LayerGraph *layer)
• SequentialBuilder &conv2d(size_t filters, std::array<size_t, 2> kernel_size, size_t in_channels, FlintDL::ActivationKind activation = FlintDL::ActivationKind::None)
• SequentialBuilder &dense(size_t in_units, size_t out_units, FlintDL::ActivationKind activation = FlintDL::ActivationKind::None)
• SequentialBuilder &maxpool2d( std::array<size_t, 2> kernel_size, const FlintDL::Pool2DOptions &options = FlintDL::Pool2DOptions())
• SequentialBuilder &avgpool2d( std::array<size_t, 2> kernel_size, const FlintDL::Pool2DOptions &options = FlintDL::Pool2DOptions())
• SequentialBuilder &flatten()
• SequentialBuilder &relu()
• SequentialBuilder &softmax(int axis = -1)
• SequentialBuilder &dropout(float probability)
• GraphModel *build()
struct GraphModel
A model for neural networks that represents the connections between the
layers as an acyclic graph. This allows arbitrary model topologies.
Import and export are implemented for the ONNX specification.
FGraphNode *operator()( FGraphNode *in, std::optional<std::reference_wrapper<std::map<LayerGraph *, long>>> time_per_layer = std::nullopt)
Feeds the single input tensor through the model and returns a single
output. This function should be used if you have a model with exactly
one input and one output tensor. The input node is only preserved if
its reference counter is >= 1. The output node has a reference
counter of 0 (so if it is used in further calculations, its reference
counter should be incremented). If time_per_layer is provided, it is
filled with the per-layer execution time in nanoseconds.
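As a rough illustration of what such a per-layer timing map contains, the sketch below times a chain of stand-in layers with std::chrono and stores nanoseconds keyed by layer pointer. ToyLayer and run_timed are hypothetical names used only for this example; the library itself works on FGraphNode and LayerGraph, and its internal timing code is not shown in this documentation.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <map>
#include <optional>
#include <vector>

// Hypothetical stand-in for a layer: it just transforms a vector of floats.
struct ToyLayer {
    std::function<std::vector<float>(const std::vector<float> &)> forward;
};

// Runs `in` through all layers in order. If `time_per_layer` is given, the
// elapsed time of every layer is stored in nanoseconds, keyed by layer pointer.
std::vector<float> run_timed(
    const std::vector<ToyLayer *> &layers, std::vector<float> in,
    std::optional<std::reference_wrapper<std::map<ToyLayer *, long>>>
        time_per_layer = std::nullopt) {
    for (ToyLayer *layer : layers) {
        assert(layer != nullptr);
        const auto start = std::chrono::steady_clock::now();
        in = layer->forward(in);
        const auto end = std::chrono::steady_clock::now();
        if (time_per_layer) {
            const auto ns =
                std::chrono::duration_cast<std::chrono::nanoseconds>(end - start);
            time_per_layer->get()[layer] = static_cast<long>(ns.count());
        }
    }
    return in;
}
```

A caller would pass std::ref(times) for a std::map<ToyLayer *, long> named times, mirroring the optional reference wrapper in the signature above.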
std::vector<FGraphNode *> operator()( std::vector<FGraphNode *> in, std::optional<std::reference_wrapper<std::map<LayerGraph *, long>>> time_per_layer = std::nullopt)
Feeds all input tensors through the model and returns the output
tensors. An input node is only preserved if its reference counter
is >= 1. Each output node has a reference counter of 0 (so if it is
used in further calculations, its reference counter should be
incremented). If time_per_layer is provided, it is filled with the
per-layer execution time in nanoseconds.
std::string serialize_onnx()
Serializes the model into an ONNX string.
std::vector<std::vector<size_t>> shape_interference(std::vector<std::vector<size_t>> input_shapes)
Infers all output shapes for the given input shapes. The returned
shapes are ordered like the output nodes of the model.
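To make the idea of shape inference concrete, here is a minimal sketch that propagates a shape through a flatten step followed by a dense layer without touching any data. flatten_shape and dense_shape are hypothetical helpers written for this example, and the assumption that the first dimension is the batch dimension is not taken from the library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Shape = std::vector<size_t>;

// Assumed flatten semantics: keep the first (batch) dimension, merge the rest.
Shape flatten_shape(const Shape &in) {
    assert(!in.empty());
    size_t rest = 1;
    for (size_t i = 1; i < in.size(); i++)
        rest *= in[i];
    return {in[0], rest};
}

// Assumed dense semantics: the last dimension must equal in_units and is
// replaced by out_units; all leading dimensions are kept.
Shape dense_shape(const Shape &in, size_t in_units, size_t out_units) {
    assert(!in.empty() && in.back() == in_units);
    Shape out = in;
    out.back() = out_units;
    return out;
}
```

Under these assumptions an input shape of {32, 28, 28, 1} flattens to {32, 784}, and a dense layer with 784 input and 10 output units maps it to {32, 10}.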
static GraphModel *load_model(std::string path)
Loads an ONNX model from disk.
static GraphModel *sequential(std::vector<LayerGraph *> list)
Builds a sequential model from the given list of layers.
static GraphModel *from_output(LayerGraph *output)
Builds a model by tracing all incoming layers of one output node.
static SequentialBuilder builder()
Returns a SequentialBuilder helper for fluent model construction.
struct SequentialBuilder
Helper for building a sequential (layer following layer) GraphModel.
SequentialBuilder &add(LayerGraph *layer)
Appends an arbitrary layer as the next layer.
SequentialBuilder &conv2d(size_t filters, std::array<size_t, 2> kernel_size, size_t in_channels, FlintDL::ActivationKind activation = FlintDL::ActivationKind::None)
Constructs and adds a 2D Convolution Layer with a default
Conv2DOptions struct.
filters: gives the number of filters to use (or "number of
kernel_size: gives the shape of the individual filter kernels (2D)
in_channels: gives the number of channels in the input tensor
activation: adds an activation layer after the convolution
SequentialBuilder &dense(size_t in_units, size_t out_units, FlintDL::ActivationKind activation = FlintDL::ActivationKind::None)
Constructs and adds a fully connected layer with a default
DenseOptions struct.
SequentialBuilder &maxpool2d( std::array<size_t, 2> kernel_size, const FlintDL::Pool2DOptions &options = FlintDL::Pool2DOptions())
Adds a max pooling layer.
SequentialBuilder &avgpool2d( std::array<size_t, 2> kernel_size, const FlintDL::Pool2DOptions &options = FlintDL::Pool2DOptions())
Adds an average pooling layer.
SequentialBuilder &flatten()
Adds a flatten layer.
SequentialBuilder &relu()
Adds a ReLU activation layer.
SequentialBuilder &softmax(int axis = -1)
Adds a softmax activation layer.
SequentialBuilder &dropout(float probability)
Adds a dropout layer.
GraphModel *build()
Builds the sequential GraphModel.
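Every builder method above returns a reference to the builder, which is what makes chained calls possible. The following miniature shows that pattern in isolation; ToyBuilder is a hypothetical stand-in that only records layer descriptions as strings and is not the library's SequentialBuilder.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature of a fluent builder: every method records a layer
// description and returns *this, so calls can be chained.
struct ToyBuilder {
    std::vector<std::string> layers;

    ToyBuilder &dense(size_t in_units, size_t out_units) {
        assert(in_units > 0 && out_units > 0);
        layers.push_back("dense(" + std::to_string(in_units) + "->" +
                         std::to_string(out_units) + ")");
        return *this;
    }
    ToyBuilder &relu() {
        layers.push_back("relu");
        return *this;
    }
    ToyBuilder &softmax(int axis = -1) {
        layers.push_back("softmax(axis=" + std::to_string(axis) + ")");
        return *this;
    }
    // Finishes the chain and hands out the recorded layer list.
    std::vector<std::string> build() const { return layers; }
};
```

With this stand-in, ToyBuilder().dense(784, 32).relu().dense(32, 10).softmax().build() yields four recorded layers in call order.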
src/dl/trainer.hpp
Overview
Types and Functions
• struct MetricInfo
• int batch = 0
• int epoch = 0
• size_t total_batches = 0
• size_t total_epochs = 0
• double last_batch_error = 0.0
• double last_epoch_error = 0.0
• double last_validation_error = 0.0
• double gradient_time_ns = 0.0
• std::vector<std::pair<std::string, double>> time_per_layer_ns
• struct ControlInformation
• class ReporterControlInformation : public ControlInformation
• struct MetricReporter
• ControlInformation &control_information() const
• virtual void model_description(std::vector<std::string> layer_names, std::vector<std::string> layer_descriptions, std::vector<size_t> number_parameters, std::string loss_fct, std::string optimizer_name, std::string optimizer_desc)
• class CLIReporter : public MetricReporter
• class NetworkMetricReporter : public MetricReporter
• struct DataLoader
• virtual std::pair<std::vector<FGraphNode *>, std::vector<FGraphNode *>> next_batch() = 0
• virtual size_t remaining_for_epoch() = 0
• virtual std::pair<std::vector<FGraphNode *>, std::vector<FGraphNode *>> validation_batch() = 0
• virtual std::pair<std::vector<FGraphNode *>, std::vector<FGraphNode *>> testing_data() = 0
• virtual size_t total_batches() const = 0
• class StaticLoader : public DataLoader
• struct IDXFormatLoader : public DataLoader
• IDXFormatLoader(size_t batch_size, std::string train_images_path, std::string train_labels_path, std::string test_images_path = "", std::string test_labels_path = "", double validation_percentage = 0.15) : DataLoader(batch_size), train_images_path(train_images_path), train_labels_path(train_labels_path), test_images_path(test_images_path), test_labels_path(test_labels_path), validation_percentage(validation_percentage)
• struct Optimizer
• virtual FGraphNode *optimize(FGraphNode *weight, FGraphNode *gradient) = 0
• virtual std::string name() const
• virtual std::string description() const
• virtual FGraphNode *calculate_loss(FGraphNode *actual, FGraphNode *expected) = 0
• virtual std::string name() const
• virtual std::string description() const
• struct CrossEntropyLoss : public LossFunction
• bool is_epoch
• double training_loss
• double validation_loss
• double training_time_ms
• double validation_time_ms
• double avg_batch_time_ms
• std::vector<std::pair<std::string, double>> avg_batch_time_per_layer_ms
• Trainer(GraphModel *model, DataLoader *dl, Optimizer *opt, LossFunction *loss) : model(model), data(dl), optimizer(opt), loss(loss)
• Trainer(GraphModel *model) : model(model)
• void enable_early_stopping(double error)
• void set_data_loader(DataLoader *dl)
• void set_optimizer(Optimizer *opt)
• void set_loss(LossFunction *loss)
• void set_metric_reporter(MetricReporter *reporter)
• TrainingMetrics train_epoch()
• void train(size_t epochs)
struct MetricInfo
Information about the current training process, passed to MetricReporters.
int batch = 0
The current batch (1-based).
int epoch = 0
The current epoch (1-based).
size_t total_batches = 0
The number of batches in this epoch.
size_t total_epochs = 0
The number of epochs requested for this training run.
double last_batch_error = 0.0
The error of the last batch.
double last_epoch_error = 0.0
The average error of the epoch.
double last_validation_error = 0.0
The validation error after the epoch.
double gradient_time_ns = 0.0
Time for gradient calculation for this batch in nanoseconds.
std::vector<std::pair<std::string, double>> time_per_layer_ns
Time per layer for this batch in nanoseconds.
struct ControlInformation
Control interface for the training process that can be used by reporters.
class ReporterControlInformation : public ControlInformation
Thread-safe default implementation for ControlInformation.
struct MetricReporter
Receives metrics during training and exposes control information.
ControlInformation &control_information() const
Access to the mutable control information.
virtual void model_description(std::vector<std::string> layer_names, std::vector<std::string> layer_descriptions, std::vector<size_t> number_parameters, std::string loss_fct, std::string optimizer_name, std::string optimizer_desc)
Receives a description of the model (layer overview, loss,
optimizer).
class CLIReporter : public MetricReporter
Default reporter that prints to stdout.
class NetworkMetricReporter : public MetricReporter
Sends training data over a REST API for HTTP connections on port 5111.
For API documentation see dl/visualization/README.md.
struct DataLoader
Loads the data for the training process.
virtual std::pair<std::vector<FGraphNode *>, std::vector<FGraphNode *>> next_batch() = 0
Loads the next batch and returns it as a pair of model input and
expected output. The first entry holds the input values for the model:
each element of the vector is a batch-sized input, and the vector form
exists for models that have multiple inputs (if your model has just
one, return a 1-element vector). The second entry holds the expected
output values.
virtual size_t remaining_for_epoch() = 0
Returns how many elements (not batches!) still have to be processed
to finish the epoch, or 0 if the epoch is finished. The value is used
for metrics and to determine whether validation can be run and the
next epoch started.
virtual std::pair<std::vector<FGraphNode *>, std::vector<FGraphNode *>> validation_batch() = 0
Returns the data for validation. Same semantics as next_batch.
virtual std::pair<std::vector<FGraphNode *>, std::vector<FGraphNode *>> testing_data() = 0
Returns the complete training dataset, used for testing after
training.
virtual size_t total_batches() const = 0
Returns the number of batches in an epoch.
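How these functions interact can be sketched with a small in-memory loader. ToyLoader is a hypothetical stand-in: plain floats replace the FGraphNode tensors, and the choices to allow a smaller last batch and to wrap around when an epoch is exhausted are assumptions of this example, not documented behaviour of the interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory loader over (input, label) samples.
struct ToyLoader {
    std::vector<float> inputs;
    std::vector<float> labels;
    size_t batch_size;
    size_t cursor = 0; // index of the first sample of the next batch

    ToyLoader(std::vector<float> in, std::vector<float> lab, size_t bs)
        : inputs(std::move(in)), labels(std::move(lab)), batch_size(bs) {
        assert(inputs.size() == labels.size() && batch_size > 0);
    }
    // Elements (not batches) still to be processed in this epoch.
    size_t remaining_for_epoch() const { return inputs.size() - cursor; }
    // Number of batches per epoch; the last one may be smaller (assumption).
    size_t total_batches() const {
        return (inputs.size() + batch_size - 1) / batch_size;
    }
    // Returns the next (inputs, expected outputs) pair. When the previous
    // epoch is exhausted, the cursor wraps around and a new epoch starts.
    std::pair<std::vector<float>, std::vector<float>> next_batch() {
        if (cursor >= inputs.size())
            cursor = 0;
        const size_t end = std::min(cursor + batch_size, inputs.size());
        std::vector<float> in(inputs.begin() + cursor, inputs.begin() + end);
        std::vector<float> out(labels.begin() + cursor, labels.begin() + end);
        cursor = end;
        return {in, out};
    }
};
```

A training loop built on this sketch would call next_batch until remaining_for_epoch returns 0, then run validation and start the next epoch.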
class StaticLoader : public DataLoader
Loads data from pre-existing tensors.
struct IDXFormatLoader : public DataLoader
DataLoader for the MNIST dataset.
TODO: do not prefetch the data, but load it lazily when needed if it is
larger than roughly 6 GB; maybe derive a second class for that.
IDXFormatLoader(size_t batch_size, std::string train_images_path, std::string train_labels_path, std::string test_images_path = "", std::string test_labels_path = "", double validation_percentage = 0.15) : DataLoader(batch_size), train_images_path(train_images_path), train_labels_path(train_labels_path), test_images_path(test_images_path), test_labels_path(test_labels_path), validation_percentage(validation_percentage)
Sets the batch size, the paths to the train and test data
and the validation percentage. The validation percentage is
the share of the training data that is split off to validate
the error after each training epoch.
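The effect of the validation percentage on the sample counts can be sketched as follows. split_counts is a hypothetical helper; treating the value as a fraction between 0 and 1 (as the default of 0.15 suggests) and rounding the validation count down are assumptions of this example, not confirmed details of IDXFormatLoader.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Splits `total` training samples into (training, validation) counts.
// Rounding the validation share down is an assumption made for this sketch.
std::pair<size_t, size_t> split_counts(size_t total,
                                       double validation_percentage) {
    assert(validation_percentage >= 0.0 && validation_percentage <= 1.0);
    const size_t validation = static_cast<size_t>(
        static_cast<double>(total) * validation_percentage);
    return {total - validation, validation};
}
```

For example, 1000 samples with a validation percentage of 0.25 leave 750 samples for training and 250 for validation.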
struct Optimizer
Interface to optimize variables.
For each Variable an Optimizer is created and managed.
virtual FGraphNode *optimize(FGraphNode *weight, FGraphNode *gradient) = 0
Updates the weight with respect to its gradient (derivative).
weight is the variable and gradient is its gradient. Returns the new
variable; the old one will be replaced. Do not set the reference
counter, as this is done by the trainer.
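The contract of optimize (take a weight and its gradient, return a new weight) can be illustrated with plain gradient descent on vectors. sgd_step and its learning_rate parameter are hypothetical; this documentation does not say which optimizers the library ships, so the update rule is chosen purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical optimizer step on plain vectors: returns the new weights
// w - learning_rate * g and leaves the inputs untouched.
std::vector<double> sgd_step(const std::vector<double> &weight,
                             const std::vector<double> &gradient,
                             double learning_rate) {
    assert(weight.size() == gradient.size());
    std::vector<double> updated(weight.size());
    for (size_t i = 0; i < weight.size(); i++)
        updated[i] = weight[i] - learning_rate * gradient[i];
    return updated;
}
```

Like optimize, the function returns a fresh value rather than modifying the weight in place; replacing the old variable is left to the caller.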
virtual std::string name() const
Human readable optimizer name used in model reports.
virtual std::string description() const
Human readable optimizer description used in model reports.
virtual FGraphNode *calculate_loss(FGraphNode *actual, FGraphNode *expected) = 0
Calculates the loss between the actual output of the model
and the expected output from the training data.
virtual std::string name() const
Human readable loss name used in model reports.
virtual std::string description() const
Human readable loss description used in model reports.
struct CrossEntropyLoss : public LossFunction
Calculates the categorical cross entropy loss with full summation.
It is advised to apply a softmax as the last activation layer in the
calculation of in. Calculates:
sum(-expected * log(in))
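The formula can be written out on plain vectors as a minimal sketch. cross_entropy is a hypothetical free function, not the library's implementation, and it assumes that in holds strictly positive probabilities such as a softmax output.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Computes sum(-expected * log(in)) over all elements.
// `in` is assumed to hold probabilities > 0 (e.g. a softmax output).
double cross_entropy(const std::vector<double> &in,
                     const std::vector<double> &expected) {
    assert(in.size() == expected.size());
    double loss = 0.0;
    for (size_t i = 0; i < in.size(); i++)
        loss += -expected[i] * std::log(in[i]);
    return loss;
}
```

With a one-hot expected vector only the predicted probability of the correct class contributes, so in = {0.5, 0.25, 0.25} and expected = {1, 0, 0} give a loss of log(2). A predicted probability of exactly 0 makes this naive version produce infinity or NaN, which is one reason to feed it softmax outputs.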
bool is_epoch
If true, an epoch has been trained; otherwise the metrics describe
a single batch and only some members are set.
double training_loss
The average loss for the training dataset for the epoch (if
is_epoch is false, it is the loss of the single batch).
double validation_loss
The average loss for the validation dataset for the epoch
(not set if is_epoch is false).
double training_time_ms
The combined time for the training dataset for the epoch (if
is_epoch is false, it is not set).
double validation_time_ms
The time for the validation dataset for the epoch (if
is_epoch is false, it is not set).
double avg_batch_time_ms
Average time for passing a batch through the model (if
is_epoch is false, it is the time of the single batch).
std::vector<std::pair<std::string, double>> avg_batch_time_per_layer_ms
Average time for passing a batch through the model per layer
(if is_epoch is false, it is the time of the single batch). Each
layer is given with its name and its execution time.
Trainer(GraphModel *model, DataLoader *dl, Optimizer *opt, LossFunction *loss) : model(model), data(dl), optimizer(opt), loss(loss)
Initializes the data of the Trainer.
The DataLoader, GraphModel and Optimizer have to be maintained by
whoever passed them, and they have to live at least as long as the
Trainer. The data for training and validation is taken from the
DataLoader. The model model is the one that is trained. The optimizer
opt is used to optimize the weights after each batch is passed through
the model. The loss function loss calculates the loss between the
output of the model and the expected output from the labeled dataset.
Trainer(GraphModel *model) : model(model)
Initializes the model that should be trained by the Trainer.
The GraphModel has to be maintained by whoever passed it, and it has to
live at least as long as the Trainer. The model model is the one that
is trained.
void enable_early_stopping(double error)
Enables the early stopping criterion for the following
training runs: even if the requested number of epochs has
not been reached, training stops once the validation error
is at or below the given error.
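The stopping rule can be sketched as a small loop. train_with_early_stopping and its run_epoch callback are hypothetical names for this example; the callback stands in for training one epoch and returning the validation error, and the Trainer's actual control flow may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Runs up to `epochs` epochs. `run_epoch` is a hypothetical callback that
// trains one epoch (given its 0-based index) and returns the validation
// error. Returns the number of epochs that were actually run.
size_t train_with_early_stopping(
    size_t epochs, double stop_error,
    const std::function<double(size_t)> &run_epoch) {
    assert(static_cast<bool>(run_epoch));
    size_t done = 0;
    while (done < epochs) {
        const double validation_error = run_epoch(done);
        done++;
        if (validation_error <= stop_error)
            break; // error is at or below the threshold: stop early
    }
    return done;
}
```

If the validation error never reaches the threshold, the loop simply runs all requested epochs.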
void set_data_loader(DataLoader *dl)
Sets the data of the Trainer.
The DataLoader has to be maintained by whoever passed it, and it has to
live at least as long as the trainer. The data for training and
validation is taken from the DataLoader.
void set_optimizer(Optimizer *opt)
The Optimizer has to be maintained by whoever passed it, and it has to
live at least as long as the trainer. It is used to optimize the
weights after each batch is passed through the model.
void set_loss(LossFunction *loss)
The LossFunction has to be maintained by whoever passed it, and it has
to live at least as long as the trainer. It is used to calculate the
error of the model per batch for optimization.
void set_metric_reporter(MetricReporter *reporter)
Sets the metric reporter (to print or display information about the
training process).
TrainingMetrics train_epoch()
Trains exactly one epoch, i.e., the complete dataset is
passed through the model by splitting it into batches of size
batch_size and passing them through the model. The weights of the
model are optimized for each batch. If a validation dataset is
available in the DataLoader, it is evaluated. This method returns
information (average loss, validation loss, total time, etc.) about
the training. If a MetricReporter is set, it reports the metrics per
batch.
void train(size_t epochs)
Trains the model for the given number of epochs (epochs). The complete
dataset is passed through the model once per epoch: it is split into
slices of size batch_size along the first dimension of the input data,
and each batch is passed through the model once per epoch. The weights
of the model are optimized after each batch. Once all batches of an
epoch have been run, the validation data is passed through the model
and the error is reported. Make sure the DataLoader and the model are
valid for the duration of this call.