mbtr package¶
Submodules¶
mbtr.losses module¶
-
class
mbtr.losses.
FourierLoss
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
mbtr.losses.Loss
Loss for the Fourier regression:
\[\mathcal{L} = \Vert y - P x\Vert_2^2\]where \(P\) is the projection matrix:
\[P=\left[\left[\cos \left(k \frac{2 \pi t}{n_{t}}\right)^{T}, \sin \left(k \frac{2 \pi t}{n_{t}}\right)^{T}\right]^{T}\right]_{k \in \mathcal{K}}\]-
eval_optimal_loss
(G2, H)¶ Evaluate the optimal loss (using response obtained minimizing the second order loss approximation).
Parameters: - G2 – squared sum of gradients in the current leaf
- H – sum of Hessians diags in the current leaf
Returns: optimal loss, scalar
-
eval_optimal_response
(G, H)¶ Evaluate optimal response, given G and H.
Parameters: - G – mean gradient for the current leaf.
- H – mean Hessian for the current leaf.
Returns: optimal response under second order loss approximation.
-
get_initial_guess
(y)¶ Return an initial guess for the prediciton. This can be loss-specific.
Parameters: y – target matrix of the training set Returns: np.ndarray with initial guess
-
projection_matrix
¶ Return projection matrix for the Fourier coefficient estimation.
Parameters: n – number of observations Returns: projection matrix P, (2*n_harmonics, n_t) where n_harmonics is the number of harmonics to fit, n_t the target dimension
-
required_pars
= ['n_harmonics']¶
-
set_dimension
(n_dim)¶ Initialize all the properties which depends on the dimension of the target
Parameters: n_dims – dimension of the target Returns: None
-
-
class
mbtr.losses.
LatentVariable
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
mbtr.losses.Loss
Loss for the hierarchical reconciliation problem, in the form:
\[\mathcal{L} = \Vert y - S x\Vert_2^2\]where \(S\) is the hierarchy matrix. The initial guess is the mean of the last columns of y.
-
compute_H_inv
¶
-
compute_fast_H_hat
¶
-
eval_optimal_loss
(yy, H)¶ Evaluate the optimal loss (using response obtained minimizing the second order loss approximation).
Parameters: - G2 – squared sum of gradients in the current leaf
- H – sum of Hessians diags in the current leaf
Returns: optimal loss, scalar
-
eval_optimal_response
(G, H)¶ Evaluate optimal response, given G and H.
Parameters: - G – mean gradient for the current leaf.
- H – mean Hessian for the current leaf.
Returns: optimal response under second order loss approximation.
-
get_grad_and_hessian_diags
(y, y_hat, iteration, leaves_idx)¶ Return the loss gradient and loss Hessian’s diagonals based on the current model estimation y_hat and target y matrices. Instead of returning the full Hessian (a 3rd order tensor), the method returns only the Hessian diagonals for each observation, stored in a (n_obs, n_t) matrix. These diagonals are then used by the loss to reconstruct the full Hessian with appropriate dimensions and structure. Currently, full Hessians inferred by data are not supported.
Parameters: - y – target matrix (n_obs, n_t)
- y_hat – current target estimation matrix (n_obs, n_t)
- iteration – current number of iteration, generally not needed
- leaves_idx – leaves’ indexes for each observation in y, (n_obs, 1). This is needed for example by
mbtr.losses.QuadraticQuantileLoss
.
Returns: grad, hessian_diags tuple, each of which is a (n_obs, n_t) matrix
-
get_initial_guess
(y)¶ The initial guess is generated from the last columns of the target matrix, as:
\[y_0 = \left( \mathbb{E} y_b \right) S^T\]where \(\mathbb{E}\) is the expectation (row-mean), \(S\) is the hierarchy matrix, and \(y_b\) stands for the last columns of y, with dimension (n_obs, n_b), where n_b is the number of bottom series.
Parameters: y – target matrix of the training set Returns: np.ndarray with initial guess
-
required_pars
= ['S', 'precision']¶
-
set_dimension
(n_dims)¶ For the latent loss the number of dimensions is equal to the second dimension of the S matrix, and must not be inferred from the target
-
-
class
mbtr.losses.
LinRegLoss
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
mbtr.losses.Loss
-
eval_optimal_loss
(G, x)¶ Evaluate the optimal loss (using response obtained minimizing the second order loss approximation).
Parameters: - G – gradient for the current leaf.
- x – linear regression features for the current leaf.
Returns: optimal loss, scalar
-
eval_optimal_response
(G, x)¶ Evaluate optimal response, given G and x. This is done computing a Ridge regression with intercept
\[w = \left(\tilde{x}^T \tilde{x} + \lambda I \right)^{-1} \left(\tilde{x}^T G \right)\]where \(\tilde{x}\) is the \(x\) matrix augmented with an unitary column and \(\lambda\) is the Ridge coefficient.
Parameters: - G – gradient for the current leaf.
- x – linear regression features for the current leaf.
Returns: optimal response under second order loss approximation.
-
set_dimension
(n_dim)¶ Initialize all the properties which depends on the dimension of the target
Parameters: n_dims – dimension of the target Returns: None
-
-
class
mbtr.losses.
Loss
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
object
Loss function class. A loss is defined by its gradient and Hessians. Note that if your specific loss funciton requires some additional argument, you can specify it in the required_pars. Upon instantiation, this list will be used to check if loss_kwargs contains all the needed parameters. Each class inheriting from
mbtr.losses.Loss
must provide an H_inv method, computing the inverse of the Hessian.Parameters: - lambda_weights – quadratic penalization parameter for the leaves weights
- lambda_leaves – quadratic penalization parameter for the number of leaves
- loss_kwargs – additional parameters needed for a specific loss type
-
H_inv
¶ Computes the inverse of the Hessian, given the Hessian’s diagonal of the current leave. The default implements MSE inverse.
Parameters: H – current leaf Hessian’s diagonal (n_t) Returns: inv(H), (n_t, n_t)
-
eval
(y, y_hat, trees)¶ Evaluates the overall loss, which is composed by the tree’s loss plus weights and total leaves penalizations
Parameters: - y – observations
- y_hat – current :class: mbtr.MBT estimations
- trees – array of fitted trees up to the current iteartion
Returns: tree loss and regularizations loss tuple, scalars
-
eval_optimal_loss
(G2, H)¶ Evaluate the optimal loss (using response obtained minimizing the second order loss approximation).
Parameters: - G2 – squared sum of gradients in the current leaf
- H – sum of Hessians diags in the current leaf
Returns: optimal loss, scalar
-
eval_optimal_response
(G, H)¶ Evaluate optimal response, given G and H.
Parameters: - G – mean gradient for the current leaf.
- H – mean Hessian for the current leaf.
Returns: optimal response under second order loss approximation.
-
get_grad_and_hessian_diags
(y, y_hat, iteration, leaves_idx)¶ Return the loss gradient and loss Hessian’s diagonals based on the current model estimation y_hat and target y matrices. Instead of returning the full Hessian (a 3rd order tensor), the method returns only the Hessian diagonals for each observation, stored in a (n_obs, n_t) matrix. These diagonals are then used by the loss to reconstruct the full Hessian with appropriate dimensions and structure. Currently, full Hessians inferred by data are not supported.
Parameters: - y – target matrix (n_obs, n_t)
- y_hat – current target estimation matrix (n_obs, n_t)
- iteration – current number of iteration, generally not needed
- leaves_idx – leaves’ indexes for each observation in y, (n_obs, 1). This is needed for example by
mbtr.losses.QuadraticQuantileLoss
.
Returns: grad, hessian_diags tuple, each of which is a (n_obs, n_t) matrix
-
get_initial_guess
(y)¶ Return an initial guess for the prediciton. This can be loss-specific.
Parameters: y – target matrix of the training set Returns: np.ndarray with initial guess
-
required_pars
= []¶
-
set_dimension
(n_dims)¶ Initialize all the properties which depends on the dimension of the target
Parameters: n_dims – dimension of the target Returns: None
-
tree_loss
(y, y_hat)¶ Compute the tree loss (without penalizations)
Parameters: - y – observations of the target on the traning set
- y_hat – current estimation of the MBT
Returns: tree loss
-
class
mbtr.losses.
MSE
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
mbtr.losses.Loss
Mean Squared Error loss, a.k.a. L2, Ordinary Least Squares.
\[\mathcal{L} = \Vert y - w\Vert_2^2 + \frac{1}{2} w^T \Lambda w\]where \(\Lambda\) is the quadratic punishment matrix.
Parameters: - lambda_weights – quadratic penalization parameter for the leaves weights
- lambda_leaves – quadratic penalization parameter for the number of leaves
- loss_kwargs – additional parameters needed for a specific loss type
-
eval_optimal_loss
(G2, H)¶ Evaluate the optimal loss (using response obtained minimizing the second order loss approximation).
Parameters: - G2 – squared sum of gradients in the current leaf
- H – sum of Hessians diags in the current leaf
Returns: optimal loss, scalar
-
eval_optimal_response
(G, H)¶ Evaluate optimal response, given G and H.
Parameters: - G – mean gradient for the current leaf.
- H – mean Hessian for the current leaf.
Returns: optimal response under second order loss approximation.
-
set_dimension
(n_dim)¶ Initialize all the properties which depends on the dimension of the target
Parameters: n_dims – dimension of the target Returns: None
-
class
mbtr.losses.
QuadraticQuantileLoss
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
mbtr.losses.Loss
-
H_inv
(H)¶ Computes the inverse of the Hessian, given the Hessian’s diagonal of the current leave. The default implements MSE inverse.
Parameters: H – current leaf Hessian’s diagonal (n_t) Returns: inv(H), (n_t, n_t)
-
exact_response
(y)¶
-
get_grad_and_hessian_diags
(y, y_hat, iteration, leaves_idx)¶ Return the loss gradient and loss Hessian’s diagonals based on the current model estimation y_hat and target y matrices. Instead of returning the full Hessian (a 3rd order tensor), the method returns only the Hessian diagonals for each observation, stored in a (n_obs, n_t) matrix. These diagonals are then used by the loss to reconstruct the full Hessian with appropriate dimensions and structure. Currently, full Hessians inferred by data are not supported.
Parameters: - y – target matrix (n_obs, n_t)
- y_hat – current target estimation matrix (n_obs, n_t)
- iteration – current number of iteration, generally not needed
- leaves_idx – leaves’ indexes for each observation in y, (n_obs, 1). This is needed for example by
mbtr.losses.QuadraticQuantileLoss
.
Returns: grad, hessian_diags tuple, each of which is a (n_obs, n_t) matrix
-
get_initial_guess
(y)¶ The initial guess are the alpha quantiles of the target matrix y.
Parameters: y – target matrix of the training set Returns: np.ndarray with initial guess
-
required_pars
= ['alphas']¶
-
tree_loss
(y, y_hat)¶ Compute the tree loss (without penalizations)
Parameters: - y – observations of the target on the traning set
- y_hat – current estimation of the MBT
Returns: tree loss
-
-
class
mbtr.losses.
QuantileLoss
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
mbtr.losses.Loss
-
H_inv
(H)¶ Computes the inverse of the Hessian, given the Hessian’s diagonal of the current leave. The default implements MSE inverse.
Parameters: H – current leaf Hessian’s diagonal (n_t) Returns: inv(H), (n_t, n_t)
-
exact_response
(y)¶
-
get_grad_and_hessian_diags
(y, y_hat, iteration, leaves_idx)¶ Return the loss gradient and loss Hessian’s diagonals based on the current model estimation y_hat and target y matrices. Instead of returning the full Hessian (a 3rd order tensor), the method returns only the Hessian diagonals for each observation, stored in a (n_obs, n_t) matrix. These diagonals are then used by the loss to reconstruct the full Hessian with appropriate dimensions and structure. Currently, full Hessians inferred by data are not supported.
Parameters: - y – target matrix (n_obs, n_t)
- y_hat – current target estimation matrix (n_obs, n_t)
- iteration – current number of iteration, generally not needed
- leaves_idx – leaves’ indexes for each observation in y, (n_obs, 1). This is needed for example by
mbtr.losses.QuadraticQuantileLoss
.
Returns: grad, hessian_diags tuple, each of which is a (n_obs, n_t) matrix
-
get_initial_guess
(y)¶ The initial guess are the alpha quantiles of the target matrix y.
Parameters: y – target matrix of the training set Returns: np.ndarray with initial guess
-
quantile_loss
(y, q_hat)¶ Quantile loss function, a.k.a. pinball loss.
\[ \begin{align}\begin{aligned}\epsilon (y,\hat{q})_{\alpha} &= \hat{q}_{\alpha} - y\\\mathcal{L} (y,\hat{q})_{\alpha} &= \epsilon (y,\hat{q})_{\alpha} \left( I_{\epsilon_{\alpha}\geq 0} -\alpha \right)\end{aligned}\end{align} \]Parameters: - y – observations of the target on the traning set
- q_hat – current estimation matrix of the quantiles
Returns: tree loss
-
required_pars
= ['alphas']¶
-
-
class
mbtr.losses.
TimeSmoother
(lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
mbtr.losses.Loss
Time-smoothing loss function. Penalizes the time-derivative of the predicted signal.
\[\mathcal{L} = \frac{1}{2}\Vert y-w\Vert_2^{2} + \frac{1}{2} w^T \left(\lambda_s D^T D + \lambda I \right) w\]where \(D\) is the second order difference matrix
\[ \begin{align}\begin{aligned}D=\left[\begin{array}{rrrrrr} 1 & -2 & 1 & & &\\& 1 & -2 & 1 & &\\& & \ddots & \ddots & \ddots &\\& & & 1 & -2 & 1\\& & & & 1 & -2 & 1 \end{array}\right]\end{aligned}\end{align} \]and \(\lambda_s\) is the coefficient for the quadratic penalization of time-derivatives. Required parameters: lambda_smooth: coefficient for the quadratic penalization of time-derivatives
-
static
build_filter_mat
(n)¶ Build the second order difference matrix
Parameters: n – target dimension Returns: D, second order difference matrix
-
compute_fast_H_inv
¶
-
required_pars
= ['lambda_smooth']¶
-
set_dimension
(n_dim)¶ Initialize all the properties which depends on the dimension of the target
Parameters: n_dims – dimension of the target Returns: None
-
update_smoothing_mat
(smoothing_weights)¶
-
static
mbtr.mbtr module¶
-
class
mbtr.mbtr.
MBT
(n_boosts: int = 20, early_stopping_rounds: int = 3, learning_rate: float = 0.1, val_ratio: int = 0, n_q: int = 10, min_leaf: int = 100, loss_type: str = 'mse', lambda_weights: float = 0.1, lambda_leaves: float = 0.1, verbose: int = 0, refit=True, **loss_kwargs)¶ Bases:
object
Multivariate Boosted Tree class. Fits a multivariate tree using boosting.
Parameters: - n_boosts – maximum number of boosting rounds. Default: 20
- early_stopping_rounds – if the total loss is non-decreasing after early_stopping_rounds, stop training. The final model is the one which achieved the lowest loss up to the final iteration. Default: 3.
- learning_rate – in [0, 1]. A learning rate < 1 helps reducing overfitting. Default: 0.1.
- val_ratio – in [0,1]. If provided, the early stop is triggered by the loss computed on a validation set, randomly extracted from the training set. The length of the validation set is val_ratio * len (training set). Default: 0.
- n_q – number of quantiles for the split search. Default: 10.
- min_leaf – minimum number of observations in one leaf. This parameter greatly affect generalization abilities. Default: 100.
- loss_type –
loss type for choosing the best splits. Currently the following losses are implemented:
mse: mean squared error loss, a.k.a. L2, ordinary least squares
time_smoother: mse with an additional penalization on the second order differences of the response function. Requires to pass also lambda_smooth parameter.
latent_variable: generate response function of dimension n_t from an arbitrary linear combination of n_r responses. This requires to pass also S and precision pars.
linear_regression: mse with linear response function. Using this loss function, when calling fit and predict methods, one must also pass x_lr as additional argument, which is the matrix of features used to train the linear response inside the leaf (which can be different from the features used to grow the tree, x).
fourier: mse with linear response function, fitted on the first n_harmonics (where the fundamental has wave-lenght equal to the target output). This requires to pass also the n_harmonics parameter.
quantile: quantile loss function, a.k.a. pinball loss. This requires to pass also the alphas parameter, a list of quantiles to be fitted.
quadratic_quantile: quadratic quantile loss function tailored for trees. It has a non-discontinuos derivative. This requires to pass also the alphas parameter, a list of quantiles to be fitted.
- lambda_weights – coefficient for the quadratic regularization of the response’s parameters. Default: 0.1
- lambda_leaves – coefficient for the quadratic regularization of the total number of leaves. This is only used when the Tree is used as a weak learner by MBT. Default: 0.1
- verbose – in {0,1}. If set to 1, the MBT return fitting information at each iteration.
- refit – if True, if the loss function has an “exact_response” method, use it to refit the tree
- loss_kwargs – possible additional arguments for the loss function
-
fit
(x, y, do_plot=False, x_lr=None)¶ - Fits an MBT using the features specified in the matrix \(x\in\mathbb{R}^{n_{obs} \times n_{f}}\), in order to predict the targets in the matrix \(y\in\mathbb{R}^{n_{obs} \times n_{t}}\), where \(n_{obs}\) is the number of observations, \(n_{f}\) the number of features and \(n_{t}\) the dimension of the target.
Parameters: - x – feature matrix, np.ndarray.
- y – target matrix, np.ndarray.
- x_lr – features for fitting the linear response inside the leaves. This is only required if a LinearLoss is being used.
-
predict
(x, n=None, x_lr=None)¶ Predicts the target based on the feature matrix x (and linear regression features x_lr).
Parameters: - x – feature matrix, np.ndarray.
- n – predict up to the nth fitted tree. If None, predict all the trees. Default: None
- x_lr – linear regression feature matrix, np.ndarray. Only required if LinearLoss has been used.
Returns: target’s predictions
-
class
mbtr.mbtr.
Tree
(n_q: int = 10, min_leaf: int = 100, loss_type: str = 'mse', lambda_weights: float = 0.1, lambda_leaves: float = 0.1, **loss_kwargs)¶ Bases:
object
Tree class. Fits both univarite and multivariate targets. It implements histogram search for the decision of the splitting points.
Parameters: - n_q – number of quantiles for the split search
- min_leaf – minimum number of observations in one leaf. This parameter greatly affect generalization abilities.
- loss_type –
loss type for choosing the best splits. Currently the following losses are implemented:
mse: mean squared error loss, a.k.a. L2, ordinary least squares
time_smoother: mse with an additional penalization on the second order differences of the response function. Requires to pass also lambda_smooth parameter.
latent_variable: generate response function of dimension n_t from an arbitrary linear combination of n_r responses. This requires to pass also S and precision pars.
linear_regression: mse with linear response function. Using this loss function, when calling fit and predict methods, one must also pass x_lr as additional argument, which is the matrix of features used to train the linear response inside the leaf (which can be different from the features used to grow the tree, x).
fourier: mse with linear response function, fitted on the first n_harmonics (where the fundamental has wave-lenght equal to the target output). This requires to pass also the n_harmonics parameter.
quantile: quantile loss function, a.k.a. pinball loss. This requires to pass also the alphas parameter, a list of quantiles to be fitted.
quadratic_quantile: quadratic quantile loss function tailored for trees. It has a non-discontinuos derivative. This requires to pass also the alphas parameter, a list of quantiles to be fitted.
- lambda_weights – coefficient for the quadratic regularization of the response’s parameters
- lambda_leaves – coefficient for the quadratic regularization of the total number of leaves. This is only used
when the Tree is used as a weak learner by MBT. :param loss_kwargs: possible additional arguments for the loss function
-
compute_loss
(G2_left, G2_right, H_left, H_right, j)¶
-
fit
(x, y, hessian=None, learning_rate=1.0, x_lr=None)¶ Fits a tree using the features specified in the matrix \(x\in\mathbb{R}^{n_{obs} \times n_{f}}\), in order to predict the targets in the matrix \(y\in\mathbb{R}^{n_{obs} \times n_{t}}\), where \(n_{obs}\) is the number of observations, \(n_{f}\) the number of features and \(n_{t}\) the dimension of the target.
Parameters: - x – feature matrix, np.ndarray.
- y – target matrix, np.ndarray.
- hessian – diagonals of the hessians \(\in\mathbb{R}^{n_{obs} \times n_{t}}\). If None, each entry is set equal to one (this will result in the default behaviour under MSE loss). Default: None
- learning_rate – learning rate used by the MBT instance. Default: 1
- x_lr – features for fitting the linear response inside the leaves. This is only required if a LinearLoss is being used.
-
predict
(x, x_lr=None)¶ Predicts the target based on the feature matrix x (and linear regression features x_lr).
param: x: feature matrix, np.ndarray. param: x_lr: linear regression feature matrix, np.ndarray.
Returns: target’s predictions
-
mbtr.mbtr.
bin_sums
¶
-
mbtr.mbtr.
leaf_stats
¶
mbtr.utils module¶
-
mbtr.utils.
check_pars
(required_pars, **kwargs)¶
-
mbtr.utils.
download_dataset
()¶
-
mbtr.utils.
load_dataset
()¶
-
mbtr.utils.
set_figure
(size, subplots=(1, 1), context='paper', style='darkgrid', font_scale=1, l=0.2, w=0.1, h=0.1, b=0.1)¶