RegressionTreeGB4TS

scalation.modeling.forecasting_old.RegressionTreeGB4TS
See the RegressionTreeGB4TS companion object
class RegressionTreeGB4TS(x: MatrixD, yy: VectorD, lags: Int, fname: Array[String], hparam: HyperParameter) extends RegressionTreeGB, ForecasterX

The RegressionTreeGB4TS class supports Gradient Boosting for Time Series data. Multi-horizon forecasting is supported via the Recursive method. Given a response vector y, a predictor matrix x is built that consists of lagged y vectors. Additional future response vectors are built for training.

y_t = f(x)

where x = [y_{t-1}, y_{t-2}, ... y_{t-lags}].
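
The lag construction described above can be sketched in plain Scala. This is only an illustration under stated assumptions, not the ScalaTion implementation: the helper name buildLagged and the use of Array instead of MatrixD/VectorD are hypothetical, and the class itself may order or trim the columns differently.

```scala
// Hypothetical helper (not part of ScalaTion): build lagged predictors from a series y.
// Row i corresponds to time t = i + lags and holds [y(t-1), y(t-2), ..., y(t-lags)].
def buildLagged(y: Array[Double], lags: Int): (Array[Array[Double]], Array[Double]) = {
  require(lags >= 1 && lags < y.length, "need 1 <= lags < y.length")
  // column j of row i is the value observed j+1 steps before time i + lags
  val x = Array.tabulate(y.length - lags, lags)((i, j) => y(i + lags - 1 - j))
  // responses trimmed so that yy(i) is the value at time i + lags, matching the rows of x
  val yy = y.drop(lags)
  (x, yy)
}
```

For example, with y = (1, 2, 3, 4, 5) and lags = 2, the first row of x is (2, 1) and the first trimmed response is 3.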

Value parameters

fname

the feature/variable names

hparam

the hyper-parameters (use REGRESSION.hp for default)

lags

the maximum lag included (inclusive)

x

the input/predictor matrix built out of lags of y (and optionally from exogenous variables ex)

yy

the output/response vector trimmed to match x.dim (@see ARX object)

Attributes

Companion
object
Supertypes
trait ForecasterX
trait Fit
trait FitM
trait Predictor
trait Model
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def forecast(t: Int, yf: MatrixD, h: Int): VectorD

Produce a vector of size h containing the 1 through h-steps ahead forecasts for the model, i.e., forecast the following time points: t+1, ..., t-1+h. Note: the yf matrix must be created before calling the forecast method. Intended to work with rolling validation (analog of the predict method). forecastAll must be called first.

Value parameters

h

the forecasting horizon, number of steps ahead to produce forecasts

t

the time point from which to make forecasts

yf

the forecast matrix (time x horizons)
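
The Recursive method for multi-horizon forecasting can be sketched as follows. This is a plain-Scala illustration, not the forecast method itself: recursiveForecast and its parameter f (any fitted 1-step ahead model) are hypothetical names, and the sketch assumes the model takes the most recent value first.

```scala
// Hypothetical helper (not part of ScalaTion): recursive multi-horizon forecasting.
// f is any fitted 1-step ahead model taking the window [most recent, ..., oldest].
def recursiveForecast(f: Array[Double] => Double, history: Array[Double],
                      lags: Int, h: Int): Array[Double] = {
  require(lags >= 1 && history.length >= lags && h >= 1)
  var window = history.takeRight(lags).reverse   // most recent value first
  val out = new Array[Double](h)
  for (k <- 0 until h) {
    val yhat = f(window)                         // 1-step ahead forecast from the current window
    out(k) = yhat
    window = yhat +: window.dropRight(1)         // feed the forecast back in, drop the oldest value
  }
  out
}
```

Each forecast beyond the first step is built on earlier forecasts rather than on actual values, which is why errors tend to accumulate as the horizon grows.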

def forecastAt(yf: MatrixD, yx: MatrixD, h: Int): VectorD

Forecast values for all y_.dim time points at horizon h (h-steps ahead). Assign to FORECAST MATRIX and return h-step ahead forecast. Note, predictAll provides predictions for h = 1.

Value parameters

h

the forecasting horizon, number of steps ahead to produce forecasts

yf

the forecast matrix for the endogenous variable y (time x horizons)

yx

the matrix of endogenous y and exogenous x values

Attributes

See also

forecastAll method in Forecaster trait.

Get the internally row-trimmed and column-expanded input matrix and response vector.

def predict(t: Int, yx: MatrixD): Double

Predict a value for y_{t+1} using the 1-step ahead forecast: y_{t+1} = f(y_t, ...) + e_{t+1}

Value parameters

t

the time point from which to make prediction

yx

the matrix of endogenous y and exogenous x values

def testF(h: Int, y_: VectorD, yx: MatrixD): (VectorD, VectorD, VectorD)

Test FORECASTS of a RegressionTreeGB4TS forecasting model y_ = f(x) + e and return its forecasts and QoF vector. Testing may be in-sample (on the training set) or out-of-sample (on the testing set) as determined by the parameters passed in. Note: must call train and forecastAll before testF.

Value parameters

h

the forecasting horizon, number of steps ahead to produce forecasts

y_

the testing/full response/output vector

yx

the matrix of endogenous y and exogenous x values

Inherited methods

def backwardElim(cols: LinkedHashSet[Int], first: Int)(using qk: Int): BestStep

Perform backward elimination to find the least predictive variable to remove from the existing model, returning the variable to eliminate, the new parameter vector and the new Quality of Fit (QoF). May be called repeatedly.

Value parameters

cols

the columns of matrix x currently included in the existing model

first

first variable to consider for elimination (the default of 1 assumes the intercept x_0 will be in any model)

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Predictor
def backwardElimAll(first: Int, cross: Boolean)(using qk: Int): (LinkedHashSet[Int], MatrixD)

Perform backward elimination to find the least predictive variables to remove from the full model, returning the variables left and the new Quality of Fit (QoF) measures for all steps.

Value parameters

cross

whether to include the cross-validation QoF measure

first

first variable to consider for elimination

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Predictor
override def buildModel(x_cols: MatrixD): RegressionTreeGB

Build a sub-model that is restricted to the given columns of the data matrix.

Value parameters

x_cols

the columns that the new model is restricted to

Attributes

Inherited from:
RegressionTreeGB
def crossValidate(k: Int, rando: Boolean): Array[Statistic]

Attributes

Inherited from:
Predictor
override def diagnose(y_: VectorD, yp_: VectorD, w: VectorD): VectorD

Diagnose the health of the model by computing the Quality of Fit (QoF) measures, from the error/residual vector and the predicted & actual responses. For some models the instances may be weighted.

Value parameters

w

the weights on the instances (defaults to null)

y_

the actual response/output vector to use (test/full)

yp_

the predicted response/output vector (test/full)

Attributes

See also

Regression_WLS

Definition Classes
Fit -> FitM
Inherited from:
Fit
def diagnose_(y: VectorD, yp: VectorD, low: VectorD, up: VectorD, alpha: Double, w: VectorD): VectorD

Diagnose the health of the model by computing the Quality of Fit (QoF) metrics/measures, from the error/residual vector and the predicted & actual responses. For some models the instances may be weighted. Include interval measures. Note: wis should be computed separately.

Value parameters

alpha

the nominal level of uncertainty (alpha) (defaults to 0.9, 90%)

low

the predicted lower bound

up

the predicted upper bound

w

the weights on the instances (defaults to null)

y

the actual response/output vector to use (test/full)

yp

the point prediction mean/median

Attributes

See also

Regression_WLS

Inherited from:
Fit
def diagnose_wis(y: VectorD, yp: VectorD, low: MatrixD, up: MatrixD, alphas: Array[Double]): Double

Diagnose the health of the model by computing the Quality of Fit (QoF) measures,

Value parameters

alphas

the array of prediction levels

low

the lower bounds for various alpha levels

up

the upper bounds for various alpha levels

y

the given time-series (must be aligned with the interval forecast)

yp

the point prediction mean/median

Attributes

Inherited from:
Fit
override def fit: VectorD

Return the Quality of Fit (QoF) measures corresponding to the labels given. Note, if sse > sst, the model introduces errors and the rSq may be negative, otherwise, R^2 (rSq) ranges from 0 (weak) to 1 (strong). Override to add more quality of fit measures.

Attributes

Definition Classes
Fit -> FitM
Inherited from:
Fit
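
The remark about negative R^2 follows from the standard definition R^2 = 1 - sse / sst. A minimal plain-Scala sketch is given here for illustration; rSquared is a hypothetical helper on plain arrays, not the fit method, and it assumes y is not constant (sst > 0).

```scala
// Hypothetical helper (not part of ScalaTion): R^2 = 1 - sse / sst.
// When sse > sst, the model does worse than predicting the mean, so R^2 < 0.
def rSquared(y: Array[Double], yp: Array[Double]): Double = {
  require(y.length == yp.length && y.nonEmpty)
  val mean = y.sum / y.length
  val sse = y.zip(yp).map { case (a, p) => (a - p) * (a - p) }.sum   // sum of squared errors
  val sst = y.map(a => (a - mean) * (a - mean)).sum                  // total sum of squares
  1.0 - sse / sst
}
```

With y = (1, 2, 3), perfect predictions give 1, predicting the mean everywhere gives 0, and the reversed predictions (3, 2, 1) give sse = 8 against sst = 2, hence R^2 = -3.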
def forecastAll(y_: VectorD, yx: MatrixD, h: Int): MatrixD

Forecast values for all y_.dim time points and all horizons (1 through h-steps ahead). Record these in the FORECAST MATRIX yf, where yf(t, k) = k-steps ahead forecast for y_t. Note: column 0, yf(?, 0), is set to y (the actual time-series values), and the last column, yf(?, h+1), is set to t (the time values, for reference). Forecasts are made recursively down the diagonals of the yf forecast matrix. The top right and bottom left triangles of the yf matrix are not forecastable.

Value parameters

h

the maximum forecasting horizon, number of steps ahead to produce forecasts

y_

the actual values to use in making forecasts

yx

the matrix of endogenous y and exogenous x values

Attributes

Inherited from:
ForecasterX
def forecastAtI(y_: VectorD, yfh: VectorD, h: Int, p: Double): (VectorD, VectorD)

Forecast intervals for all y_.dim time points at horizon h (h-steps ahead). Create prediction intervals (two vectors) for the given time points at level p. Caveat: assumes errors follow a Normal distribution. Override this method to handle other cases.

Value parameters

h

the forecasting horizon, number of steps ahead to produce forecasts

p

the level (1 - alpha) for the prediction interval

y_

the aligned actual values to use in making forecasts

yfh

the forecast vector at horizon h

Attributes

Inherited from:
ForecasterX
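
One way to form prediction intervals under the Normal-errors assumption is sketched below in plain Scala. This is an assumption-laden illustration, not the forecastAtI implementation: normalInterval is a hypothetical helper, the critical value z (for example, about 1.96 for p = 0.95) is passed in by the caller rather than derived from p, and the spread is estimated from the sample standard deviation of the forecast errors.

```scala
// Hypothetical helper (not part of ScalaTion): symmetric intervals yfh +/- z * sd,
// where sd is the sample standard deviation of the errors y - yfh.
def normalInterval(y: Array[Double], yfh: Array[Double],
                   z: Double): (Array[Double], Array[Double]) = {
  require(y.length == yfh.length && y.length > 1)
  val e = y.zip(yfh).map { case (a, f) => a - f }                     // forecast errors
  val mean = e.sum / e.length
  val sd = math.sqrt(e.map(v => (v - mean) * (v - mean)).sum / (e.length - 1))
  (yfh.map(_ - z * sd), yfh.map(_ + z * sd))                          // (lower, upper) bounds
}
```

If the errors are skewed or heavy-tailed, intervals built this way can be badly calibrated, which is the case the documented advice to override the method is aimed at.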
def forwardSel(cols: LinkedHashSet[Int])(using qk: Int): BestStep

Perform forward selection to find the most predictive variable to add to the existing model, returning the variable to add and the new model. May be called repeatedly.

Value parameters

cols

the columns of matrix x currently included in the existing model

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Predictor
def forwardSelAll(cross: Boolean)(using qk: Int): (LinkedHashSet[Int], MatrixD)

Perform forward selection to find the most predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.

Value parameters

cross

whether to include the cross-validation QoF measure

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Predictor
def fullModel(qk: Int): BestStep

Run the full model before variable elimination as a starting point for backward elimination.

Value parameters

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

Inherited from:
Predictor

Return the best model found from feature selection.

Attributes

Inherited from:
Predictor
def getFname: Array[String]

Return the feature/variable names.

Attributes

Inherited from:
Predictor
def getX: MatrixD

Return the used data matrix x. Mainly for derived classes where x is expanded from the given columns in x_, e.g., SymbolicRegression.quadratic adds squared columns.

Attributes

Inherited from:
Predictor
def getY: VectorD

Return the used response vector y. Mainly for derived classes where y is transformed, e.g., TranRegression, ARX.

Attributes

Inherited from:
Predictor

Return the y-transformation.

Attributes

Inherited from:
Fit
def getYY: MatrixD

Return the used response matrix y, if needed.

Attributes

See also

neuralnet.PredictorMV

Inherited from:
Model
override def help: String

Return the help string that describes the Quality of Fit (QoF) measures provided by the Fit trait. Override to correspond to fitLabel.

Attributes

Definition Classes
Fit -> FitM
Inherited from:
Fit

Return the hyper-parameters.

Attributes

Inherited from:
Predictor
def importance(cols: Array[Int], rSq: MatrixD): Array[(Int, Double)]

Return the relative importance of selected variables, ordered highest to lowest, rescaled so the highest is one.

Value parameters

cols

the selected columns/features/variables

rSq

the matrix of R^2 values (a stand-in for sse)

Attributes

Inherited from:
Predictor
def ll(ms: Double, s2: Double, m2: Int): Double

The log-likelihood function times -2. Override as needed.

Value parameters

ms

raw Mean Squared Error

s2

MLE estimate of the population variance of the residuals

Attributes

Inherited from:
Fit
def mse_: Double

Return the mean of the squares for error (sse / df). Must call diagnose first.

Attributes

Inherited from:
Fit
def numTerms: Int

Return the number of terms/parameters in the model, e.g., b_0 + b_1 x_1 + b_2 x_2 has three terms.

Attributes

Inherited from:
Predictor

Return the vector of parameter/coefficient values.

Attributes

Inherited from:
Predictor
override def predict(z: VectorD): Double

Given a data vector z, predict a value by summing the predicted values for each tree moderated by the learning rate eta.

Value parameters

z

the data vector to predict

Attributes

Inherited from:
RegressionTreeGB
override def predict(z: MatrixD): VectorD

Given a data matrix z, predict a value for each row of the matrix by summing the predictions of each tree.

Value parameters

z

the data matrix to predict

Attributes

Inherited from:
RegressionTreeGB
def rSq0_: Double

Attributes

Inherited from:
FitM
def rSq_: Double

Return the coefficient of determination (R^2). Must call diagnose first.

Attributes

Inherited from:
FitM
def report(ftMat: MatrixD): String

Return a basic report on a trained and tested multi-variate model.

Value parameters

ftMat

the matrix of qof values produced by the Fit trait

Attributes

Inherited from:
Model
def report(ftVec: VectorD): String

Return a basic report on a trained and tested model.

Value parameters

ftVec

the vector of qof values produced by the Fit trait

Attributes

Inherited from:
Model
def resetBest(): Unit

Reset the best-step to default

Attributes

Inherited from:
Predictor
def resetDF(df_update: (Double, Double)): Unit

Reset the degrees of freedom to the new updated values. For some models, the degrees of freedom are not known until after the model is built.

Value parameters

df_update

the updated degrees of freedom (model, error)

Attributes

Inherited from:
Fit
def select0(qk: Int): BestStep

Evaluate the model with only one column, e.g., an intercept-only model.

Value parameters

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

Inherited from:
Predictor
def selectFeatures(tech: SelectionTech, cross: Boolean)(using qk: Int): (LinkedHashSet[Int], MatrixD)

Perform feature selection to find the most predictive features/variables to have in the model, returning the features/variables added and the new Quality of Fit (QoF) measures/metrics for all steps.

Value parameters

cross

whether to include the cross-validation QoF measure

qk

index of Quality of Fit (QoF) to use for comparing quality

tech

the feature selection technique to apply

Attributes

See also

Fit for index of QoF measures/metrics.

Inherited from:
FeatureSelection
def show_interval_forecasts(yy: VectorD, yfh: VectorD, low: VectorD, up: VectorD, qof_all: VectorD, h: Int): Unit

Show the prediction interval forecasts and relevant QoF metrics/measures.

Value parameters

h

the forecasting horizon

low

the predicted lower bound

qof_all

all the QoF metrics (for point and interval forecasts)

up

the predicted upper bound

yfh

the forecasts for horizon h

yy

the aligned actual response/output vector to use (test/full)

Attributes

Inherited from:
Fit
inline def smapeF(y: VectorD, yp: VectorD, e_: VectorD): Double

Return the symmetric Mean Absolute Percentage Error (sMAPE) score. Caveat: y_i = yp_i = 0 => no error => no percentage error

Value parameters

e_

the error/residual vector (if null, recompute)

y

the given time-series (must be aligned with the forecast)

yp

the forecasted time-series

Attributes

Inherited from:
FitM
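
The sMAPE caveat can be made concrete with a short plain-Scala sketch. It uses one common definition, 200/n times the sum of |y_i - yp_i| / (|y_i| + |yp_i|); whether smapeF uses exactly this scaling is an assumption, and the helper name smape is hypothetical.

```scala
// Hypothetical helper (not part of ScalaTion): sMAPE under one common definition.
// A term with y_i = yp_i = 0 has no error, so it contributes 0 instead of 0/0.
def smape(y: Array[Double], yp: Array[Double]): Double = {
  require(y.length == yp.length && y.nonEmpty)
  val terms = y.zip(yp).map { case (a, p) =>
    val denom = math.abs(a) + math.abs(p)
    if (denom == 0.0) 0.0 else math.abs(a - p) / denom
  }
  200.0 * terms.sum / y.length
}
```

For y = (100, 0) and yp = (50, 0), the first term is 50/150 and the second is treated as zero, giving a score of 100/3.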
def sse_: Double

Return the sum of the squares for error (sse). Must call diagnose first.

Attributes

Inherited from:
FitM
def stepwiseSelAll(cross: Boolean, swap: Boolean)(using qk: Int): (LinkedHashSet[Int], MatrixD)

Perform stepwise regression to find the most predictive variables to have in the model, returning the variables left and the new Quality of Fit (QoF) measures for all steps. At each step it calls forwardSel and backwardElim and takes the best of the two actions. Stops when neither action yields improvement.

Value parameters

cross

whether to include the cross-validation QoF measure

qk

index of Quality of Fit (QoF) to use for comparing quality

swap

whether to allow a swap step (swap out a feature for a new feature in one step)

Attributes

See also

Fit for index of QoF measures.

Inherited from:
Predictor
override def summary(x_: MatrixD, fname: Array[String], b: VectorD, vifs: VectorD): String

Produce a QoF summary for a model with diagnostics for each predictor x_j and the overall Quality of Fit (QoF). Note: Fac_Cholesky is used to compute the inverse of xtx.

Value parameters

b

the parameters/coefficients for the model

fname

the array of feature/variable names

vifs

the Variance Inflation Factors (VIFs)

x_

the testing/full data/input matrix

Attributes

Definition Classes
Fit -> FitM
Inherited from:
Fit
def swapVars(cols: LinkedHashSet[Int], out: Int, in: Int, qk: Int): BestStep

Swap the out variable with the in variable.

Value parameters

cols

the columns of matrix x currently included in the existing model

in

the variable to swap in

out

the variable to swap out

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

Inherited from:
Predictor
def test(x_: MatrixD, y_: VectorD): (VectorD, VectorD)

Test a predictive model y_ = f(x_) + e and return its QoF vector. Testing may be in-sample (on the training set) or out-of-sample (on the testing set) as determined by the parameters passed in. Note: must call train before test.

Value parameters

x_

the testing/full data/input matrix (defaults to full x)

y_

the testing/full response/output vector (defaults to full y)

Attributes

Inherited from:
RegressionTreeGB
inline def testIndices(n_test: Int, rando: Boolean): IndexedSeq[Int]

Return the indices for the test-set.

Value parameters

n_test

the size of test-set

rando

whether to select indices randomly or in blocks

Attributes

See also

scalation.mathstat.TnT_Split

Inherited from:
Predictor
def train(x_: MatrixD, y_: VectorD): Unit

Use Gradient Boosting for Training. For every iteration, evaluate the residual and form a Regression Tree where the residual is the dependent value (equal to the gradient if using SSE as the loss function).

Value parameters

x_

the training/full data/input matrix

y_

the training/full response/output vector

Attributes

Inherited from:
RegressionTreeGB
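
The residual-fitting loop described for train, together with the learning-rate-scaled sum used by predict, can be sketched with depth-1 trees (stumps) in plain Scala. This is a simplified illustration, not the RegressionTreeGB code: Stump, fitStump, Boosted and trainGB are hypothetical names, and starting from the mean of y is an assumption.

```scala
// Hypothetical depth-1 regression tree: predicts left if z(feature) <= threshold, else right.
final case class Stump(feature: Int, threshold: Double, left: Double, right: Double) {
  def predict(z: Array[Double]): Double = if (z(feature) <= threshold) left else right
}

// Fit the stump that minimizes SSE against the target r (here, the current residuals).
def fitStump(x: Array[Array[Double]], r: Array[Double]): Stump = {
  val m = r.sum / r.length
  var best = Stump(0, Double.PositiveInfinity, m, 0.0)       // constant fallback: predict the mean
  var bestSse = r.map(v => (v - m) * (v - m)).sum
  for (j <- x.head.indices; thr <- x.map(_(j)).distinct) {
    val (l, rt) = r.indices.partition(i => x(i)(j) <= thr)
    if (l.nonEmpty && rt.nonEmpty) {
      val ml = l.map(i => r(i)).sum / l.size
      val mr = rt.map(i => r(i)).sum / rt.size
      val sse = l.map(i => (r(i) - ml) * (r(i) - ml)).sum +
                rt.map(i => (r(i) - mr) * (r(i) - mr)).sum
      if (sse < bestSse) { bestSse = sse; best = Stump(j, thr, ml, mr) }
    }
  }
  best
}

// Boosted model: initial value plus the sum of tree predictions scaled by the learning rate eta.
final case class Boosted(init: Double, eta: Double, trees: Vector[Stump]) {
  def predict(z: Array[Double]): Double = init + eta * trees.map(_.predict(z)).sum
}

// Each iteration fits a tree to the residuals y - yp, which for SSE loss
// equal the negative gradient up to a constant factor.
def trainGB(x: Array[Array[Double]], y: Array[Double], nTrees: Int, eta: Double): Boosted = {
  require(x.nonEmpty && x.length == y.length)
  val init = y.sum / y.length                                 // assumed starting prediction
  val yp = Array.fill(y.length)(init)
  var trees = Vector.empty[Stump]
  for (_ <- 0 until nTrees) {
    val resid = y.indices.map(i => y(i) - yp(i)).toArray
    val tree = fitStump(x, resid)
    trees = trees :+ tree
    for (i <- y.indices) yp(i) += eta * tree.predict(x(i))    // shrunken update
  }
  Boosted(init, eta, trees)
}
```

A smaller eta makes each tree correct only part of the remaining residual, so more trees are needed, but the fit is usually less prone to overfitting.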
def train2(x_: MatrixD, y_: VectorD): Unit

The train2 method should work like the train method, but should also optimize hyper-parameters (e.g., shrinkage or learning rate). Only implementing classes needing this capability should override this method.

Value parameters

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

Attributes

Inherited from:
Predictor
def trainNtest(x_: MatrixD, y_: VectorD)(xx: MatrixD, yy: VectorD): (VectorD, VectorD)

Train and test the predictive model y_ = f(x_) + e and report its QoF and plot its predictions. Return the predictions and QoF. FIX - currently must override if y is transformed, @see TranRegression

Value parameters

x_

the training/full data/input matrix (defaults to full x)

xx

the testing/full data/input matrix (defaults to full x)

y_

the training/full response/output vector (defaults to full y)

yy

the testing/full response/output vector (defaults to full y)

Attributes

Inherited from:
Predictor
def validate(rando: Boolean, ratio: Double)(idx: IndexedSeq[Int]): VectorD

Attributes

Inherited from:
Predictor
def vif(skip: Int): VectorD

Compute the Variance Inflation Factor (VIF) for each variable to test for multi-collinearity by regressing x_j against the rest of the variables. A VIF over 50 indicates that over 98% of the variance of x_j can be predicted from the other variables, so x_j may be a candidate for removal from the model. Note: override this method to use a superior regression technique.

Value parameters

skip

the number of columns of x at the beginning to skip in computing VIF

Attributes

Inherited from:
Predictor
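
The link between a VIF of 50 and 98% of the variance follows from the standard relation VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing x_j on the other variables. A small plain-Scala sketch of that arithmetic, with hypothetical helper names:

```scala
// Hypothetical helpers (not part of ScalaTion): convert between R_j^2 and VIF_j.
def vifFromRSq(rSq: Double): Double = {
  require(rSq < 1.0, "R^2 must be below 1")
  1.0 / (1.0 - rSq)
}

def rSqFromVif(vif: Double): Double = {
  require(vif >= 1.0, "VIF is at least 1")
  1.0 - 1.0 / vif
}
```

Here rSqFromVif(50) gives 1 - 1/50 = 0.98, matching the 98% figure in the description.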

Inherited fields

var modelConcept: URI

The optional reference to an ontological concept

Attributes

Inherited from:
Model
var modelName: String

The name for the model (or modeling technique).

Attributes

Inherited from:
Model