#include <CGAL/Classification/ETHZ_random_forest_classifier.h>

Definition

Classifier based on the ETH Zurich version of the random forest algorithm [2].

Note: This classifier is distributed under the MIT license.

Is Model Of: CGAL::Classification::Classifier

Constructor
	ETHZ_random_forest_classifier (const Label_set &labels, const Feature_set &features)
	Instantiate the classifier using the sets of `labels` and `features`.

Training
template<typename LabelIndexRange >
void	train (const LabelIndexRange &ground_truth, bool reset_trees=true, std::size_t num_trees=25, std::size_t max_depth=20)
	Runs the training algorithm.

Input/Output
void	save_configuration (std::ostream &output)
	Saves the current configuration in the stream `output`.

void	load_configuration (std::istream &input)
	Loads a configuration from the stream `input`.

Member Function Documentation

◆ load_configuration()

void CGAL::Classification::ETHZ_random_forest_classifier::load_configuration ( std::istream & input )

Loads a configuration from the stream input.

The input file should be a GZIP container written by the save_configuration() method. The feature set of the classifier should contain the exact same features in the exact same order as the ones present when the file was generated using save_configuration().

◆ save_configuration()

void CGAL::Classification::ETHZ_random_forest_classifier::save_configuration ( std::ostream & output )

Saves the current configuration in the stream output.

This makes it easy to save and later recover a specific classification configuration.

The output is written in a GZIP container that is readable by the load_configuration() method.

◆ train()

template<typename LabelIndexRange >

void CGAL::Classification::ETHZ_random_forest_classifier::train	(	const LabelIndexRange &	ground_truth,
		bool	reset_trees = `true`,
		std::size_t	num_trees = `25`,
		std::size_t	max_depth = `20`
	)

Runs the training algorithm.

From the provided ground truth, this algorithm sets up the random trees that produce the most accurate result with respect to this ground truth.

Precondition: At least one ground truth item should be assigned to each label.

Parameters

ground_truth	vector of label indices. For each input item, in the same order as the input set, it should contain the index of the corresponding label in the `Label_set` provided in the constructor. Input items that do not have ground truth information should be given the value `-1`.
reset_trees	should be set to `false` if the user wants to add new trees to the existing forest, and kept at `true` if the training should be recomputed from scratch (discarding the current forest).
num_trees	number of trees generated by the training algorithm. Higher values may improve the result at the cost of longer computation times (in general, a few dozen trees are enough).
max_depth	maximum depth of the trees. Higher values will improve how the forest fits the training set. An overly low value will underfit the data, and conversely an overly high value will likely overfit.
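
The layout of `ground_truth` and the precondition above can be illustrated with plain standard-library code. This is only a sketch: `make_ground_truth` and `every_label_has_ground_truth` are hypothetical helpers that are not part of CGAL, and the sketch assumes that a `std::vector<int>` is an acceptable `LabelIndexRange`.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of CGAL: builds a ground truth vector for
// num_items input items. Every item starts as unlabeled (-1); each
// (item index, label index) pair in annotations then assigns one label.
// Pairs whose item index is out of range are ignored.
std::vector<int> make_ground_truth(
    std::size_t num_items,
    const std::vector<std::pair<std::size_t, int>>& annotations)
{
  std::vector<int> ground_truth(num_items, -1);
  for (const auto& a : annotations)
    if (a.first < num_items)
      ground_truth[a.first] = a.second;
  return ground_truth;
}

// Hypothetical helper, not part of CGAL: checks the documented precondition
// that each of the num_labels labels has at least one ground truth item.
// Entries equal to -1 (no ground truth) are skipped.
bool every_label_has_ground_truth(const std::vector<int>& ground_truth,
                                  std::size_t num_labels)
{
  std::vector<bool> seen(num_labels, false);
  for (int label : ground_truth)
    if (label >= 0 && static_cast<std::size_t>(label) < num_labels)
      seen[static_cast<std::size_t>(label)] = true;
  for (bool s : seen)
    if (!s)
      return false;
  return true;
}
```

A caller could run the second helper on the vector before passing it to train(), so that a label with no annotated item is detected early.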
