PredictorMV

scalation.modeling.neuralnet.PredictorMV
See the PredictorMV companion object
trait PredictorMV(x: MatrixD, y: MatrixD, var fname: Array[String], hparam: HyperParameter) extends Model, FeatureSelection

The PredictorMV trait provides a framework for multiple predictive analytics techniques, e.g., Multi-variate Regression and Neural Networks. The input x is multi-dimensional [1, x_1, ... x_k] and so is the response y. Fit the NetParam parameters bb in, for example, the regression equation y = f(bb dot x) + e, where bb is an array of NetParam and each component is a weight matrix and a bias vector.

Value parameters

fname

the feature/variable names (if null, use x_j's)

hparam

the hyper-parameters for the model/network

x

the input/data m-by-n matrix (augment with a first column of ones to include intercept in model or use bias)

y

the response/output m-by-ny matrix
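The equation y = f(bb dot x) + e can be made concrete with a small sketch. It does not use ScalaTion: `LayerParam`, `layerForward` and `forward` are hypothetical names, plain arrays stand in for MatrixD/VectorD, and applying the same activation f in every layer is an assumption made for brevity.

```scala
// Hypothetical stand-in for NetParam: one layer's weight matrix w (n-by-ny) and bias vector b (ny).
final case class LayerParam(w: Array[Array[Double]], b: Array[Double])

// Apply one layer to an input vector x: f(x dot w + b), one value per output column.
def layerForward(x: Array[Double], p: LayerParam, f: Double => Double): Array[Double] =
  p.b.indices.map { j =>
    val dot = x.indices.map(i => x(i) * p.w(i)(j)).sum
    f(dot + p.b(j))
  }.toArray

// Chain the layers in bb; the output of each layer is the input of the next.
def forward(x: Array[Double], bb: Array[LayerParam], f: Double => Double): Array[Double] =
  bb.foldLeft(x)((h, p) => layerForward(h, p, f))
```

With a single layer and f set to the identity, this reduces to a multi-variate linear model with one output per column of w.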

Attributes

See also

NetParam

Companion
object
Supertypes
trait Model
class Object
trait Matchable
class Any
Known subtypes
class CNN_1D
class NeuralNet_2L
class NeuralNet_3L
class NeuralNet_XL
class RegressionMV

Members list

Value members

Abstract methods

def buildModel(x_cols: MatrixD, fname2: Array[String] = ...): PredictorMV & Fit

Build a sub-model that is restricted to the given columns of the data matrix. Override for models that support feature selection.

Value parameters

fname2

the variable/feature names for the new model (defaults to null)

x_cols

the columns that the new model is restricted to

Attributes

Predict the value of y = f(z) by evaluating the formula y = b dot z, e.g., (b_0, b_1, b_2) dot (1, z_1, z_2). Must override when using transformations, e.g., ExpRegression.

Value parameters

z

the new vector to predict

Attributes

def test(x_: MatrixD = ..., y_: MatrixD = ...): (MatrixD, MatrixD)

Test the predictive model y_ = f(x_) + e and return its predictions and QoF matrix. Each variable's predictions and QoF values are returned in the columns of the respective matrices. Testing may be in-sample (on the full dataset) or out-of-sample (on the testing set), as determined by the parameters passed in. Note: train must be called before test.

Value parameters

x_

the testing/full data/input matrix (defaults to full x)

y_

the testing/full response/output matrix (defaults to full y)

Attributes

def train(x_: MatrixD = ..., y_: MatrixD = ...): Unit

Train a predictive model y_ = f(x_) + e where x_ is the data/input matrix and y_ is the response/output matrix. These arguments default to the full dataset x and y, but may be restricted to a training dataset. Training involves estimating the model parameters bb.

Value parameters

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output matrix (defaults to full y)

Attributes

Concrete methods

def backwardElim(cols: LinkedHashSet[Int], first: Int = ...)(using qk: Int): BestStep

Perform backward elimination to find the least predictive variable to remove from the existing model, returning the variable to eliminate, the new parameter matrix and the new Quality of Fit (QoF). May be called repeatedly.

Value parameters

cols

the columns of matrix x currently included in the existing model

first

the first variable to consider for elimination (the default of 1 assumes the intercept x_0 will be in any model)

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.
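A single backward-elimination step can be sketched as follows. This is an illustration only, not the ScalaTion implementation: `backwardStep` is a hypothetical name, and `score` is a hypothetical stand-in for building the sub-model on the given columns and reading the QoF measure at index qk (higher is assumed to be better).

```scala
import scala.collection.mutable.LinkedHashSet

// Find the column (index >= first) whose removal leaves the highest score.
// Returns the column to eliminate and the score of the reduced model, or None
// when no column is eligible for removal.
def backwardStep(cols: LinkedHashSet[Int], first: Int,
                 score: Set[Int] => Double): Option[(Int, Double)] =
  val candidates = cols.toList.filter(_ >= first)
  if candidates.isEmpty then None
  else Some(candidates.map(j => (j, score(cols.toSet - j))).maxBy(_._2))
```

Calling the step repeatedly, removing the returned column each time, mirrors the "may be called repeatedly" usage described above.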

def backwardElimAll(first: Int = ..., cross: String = ...)(using qk: Int): (LinkedHashSet[Int], ArrayBuffer[VectorD])

Perform BACKWARD ELIMINATION to find the LEAST predictive variables to remove from the full model, returning the variables left and the new Quality of Fit (QoF) measures for all steps.

Value parameters

cross

indicator to include the cross-validation/validation QoF measure (defaults to "many")

first

first variable to consider for elimination

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

def beamSelAll(cross: String = ..., bk: Int = ...)(using qk: Int): (LinkedHashSet[Int], ArrayBuffer[VectorD])

Perform BEAM SEARCH SELECTION to find a GOOD COMBINATION of predictive features/variables to have in the model, returning the top k sets of features/variables selected and the new Quality of Fit (QoF) measures/metrics for all steps. At each step, iterate over the models in the beam (top k) and create candidates by adding features (phase 1) and then removing features (phase 2). From all the candidates, keep the best k and start a new iteration. Stops when there is no improvement in any of the top k (or the maximum number of features is reached).

Value parameters

bk

the beam width holding the top k models (defaults to 3)

cross

indicator to include the cross-validation/validation QoF measure (defaults to "many")

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures/metrics.

def crossValidate(k: Int = ..., rando: Boolean = ...): Array[Statistic]

Convert QoF results into an array (of size 1) of Statistic for compatibility with the crossValidate method.

Value parameters

qof

the Quality of Fit (QoF) results

    def qof2Stat (qof: MatrixD): Array [Statistic] =
        val stats = Fit.qofStatTable                            // create table for QoF measures
        if qof(QoF.sst.ordinal)(0) > 0.0 then                   // requires variation in test-set
            for q <- qof.indices do stats(q).tally (qof(q)(0))  // tally these QoF measures
        stats
    end qof2Stat

Attributes

def forwardSel(cols: LinkedHashSet[Int])(using qk: Int): BestStep

Perform forward selection to find the most predictive variable to add to the existing model, returning the variable to add and the new model. May be called repeatedly.

Value parameters

cols

the columns of matrix x currently included in the existing model

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

def forwardSelAll(cross: String = ...)(using qk: Int): (LinkedHashSet[Int], ArrayBuffer[VectorD])

Perform FORWARD SELECTION to find the MOST predictive variables to have in the model, returning the variables added and the new Quality of Fit (QoF) measures for all steps.

Value parameters

cross

indicator to include the cross-validation/validation QoF measure (defaults to "many")

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.
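The greedy loop behind forward selection can be sketched in plain Scala. This is a minimal sketch under stated assumptions, not the ScalaTion code: `forwardStep` and `forwardAll` are hypothetical names, `score` stands in for fitting a sub-model and reading one QoF measure (higher is better), column 0 is assumed to be an intercept that is always kept, and stopping at the first step without improvement is an assumed stopping rule.

```scala
import scala.collection.mutable.LinkedHashSet

// One step: the column not yet in cols whose addition gives the highest score.
def forwardStep(cols: LinkedHashSet[Int], nCols: Int,
                score: Set[Int] => Double): Option[(Int, Double)] =
  val candidates = (0 until nCols).filterNot(cols.contains)
  if candidates.isEmpty then None
  else Some(candidates.map(j => (j, score(cols.toSet + j))).maxBy(_._2))

// Repeat the step, recording columns in the order they were added.
def forwardAll(nCols: Int, score: Set[Int] => Double): LinkedHashSet[Int] =
  val cols = LinkedHashSet(0)                  // assumed intercept column
  var best = score(cols.toSet)
  var go   = true
  while go do
    forwardStep(cols, nCols, score) match
      case Some((j, s)) if s > best =>
        cols += j                              // accept the best candidate
        best = s
      case _ =>
        go = false                             // no candidate improves the score
  cols
```

A LinkedHashSet is used here, as in the method signatures, because it preserves insertion order, so the result also records the order in which variables entered the model.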

Return the best model found from feature selection.

Attributes

def getFname: Array[String]

Return the feature/variable names.

Attributes

def getX: MatrixD

Return the used data matrix x. Mainly for derived classes where x is expanded from the given columns in x_.

Attributes

def getY: VectorD

Return the used response vector y (first column in matrix).

Attributes

override def getYY: MatrixD

Return the used response matrix y. Mainly for derived classes where y is transformed.

Attributes

Definition Classes

Return the hyper-parameters.

Attributes

def inSample_Test(skip: Int = ..., showYp: Boolean = ...): (VectorD, VectorD)

Perform In-Sample Testing, i.e., train and test on the full data set. Return the prediction and the Quality of Fit.

Value parameters

showYp

whether to show the prediction vector

skip

the number of initial data points to skip (due to insufficient information)

Attributes

def makePlots(yy: MatrixD, yp: MatrixD): Unit

Make plots for each output/response variable (column of matrix y). Must override if the response matrix is transformed or rescaled.

Value parameters

yp

the testing/full predicted response/output matrix (defaults to full y)

yy

the testing/full actual response/output matrix (defaults to full y)

Attributes

def mcols: LinkedHashSet[Int]

Return the set of columns (numbers) for the features in this model.

Attributes

def numTerms: Int

Return the number of terms/parameters in the model, e.g., b_0 + b_1 x_1 + b_2 x_2 has three terms.

Attributes

def orderByY(y_: VectorD, yp_: VectorD): (VectorD, VectorD)

Order vectors y_ and yp_ according to the ascending order of y_.

Value parameters

y_

the vector to order by (e.g., true response values)

yp_

the vector to be ordered by y_ (e.g., predicted response values)
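The ordering described above amounts to sorting both vectors by the same permutation. A minimal sketch on plain arrays, where `orderByYSketch` is a hypothetical name rather than the library method:

```scala
// Sort both vectors by ascending y, keeping each prediction paired with its actual value.
def orderByYSketch(y: Array[Double], yp: Array[Double]): (Array[Double], Array[Double]) =
  require(y.length == yp.length, "vectors must have equal length")
  val rank = y.indices.sortBy(i => y(i))       // indices that put y in ascending order
  (rank.map(i => y(i)).toArray, rank.map(i => yp(i)).toArray)
```

Ordering by the actual values makes a plot of actual versus predicted responses easier to read, since the actual curve becomes monotone.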

Attributes

Return only the first matrix of parameter/coefficient values.

Attributes

Return the array of network parameters (weight matrix, bias vector) bb.

Attributes

def predict(x_: MatrixD): MatrixD

Predict the value of vector y = f(x_, b), e.g., x_ * b for Regression.

Value parameters

x_

the matrix to use for making predictions, one for each row

Attributes

override def report(ftMat: MatrixD): String

Return a basic report on a trained and tested multi-variate model.

Value parameters

ftMat

the matrix of qof values produced by the Fit trait

Attributes

Definition Classes
def resetBest(): Unit

Reset the best-step to default

Attributes

Return the matrix of residuals/errors.

Attributes

def stepwiseSelAll(cross: String = ..., swap: Boolean = ...)(using qk: Int): (LinkedHashSet[Int], ArrayBuffer[VectorD])

Perform STEPWISE SELECTION to find a GOOD COMBINATION of predictive variables to have in the model, returning the variables selected and the new Quality of Fit (QoF) measures for all steps. At each step it calls 'forwardSel' and 'backwardElim' and takes the best of the two actions. Stops when neither action yields improvement.

Value parameters

cross

indicator to include the cross-validation/validation QoF measure (defaults to "many")

qk

index of Quality of Fit (QoF) to use for comparing quality

Attributes

See also

Fit for index of QoF measures.

def test(x_: MatrixD, y_: VectorD): (VectorD, VectorD)

Test/evaluate the model's Quality of Fit (QoF) and return the predictions and QoF vectors. This may include the importance of its parameters (e.g., if 0 is in a parameter's confidence interval, it is a candidate for removal from the model). Extending traits and classes should implement various diagnostics for the test and full (training + test) datasets.

Value parameters

x_

the testing/full data/input matrix (impl. classes may default to x)

y_

the testing/full response/output vector (impl. classes may default to y)

Attributes

inline def testIndices(n_test: Int, rando: Boolean): IndexedSeq[Int]

Return the indices for the test-set.

Value parameters

n_test

the size of test-set

rando

whether to select indices randomly or in blocks

Attributes

See also

scalation.mathstat.TnT_Split

inline def testIndices(n_total: Int, n_test: Int, rando: Boolean): IndexedSeq[Int]

Return the indices for the test-set for (1) RANDOMLY or (3) LAST.

Value parameters

n_test

the size of test-set

n_total

the size of full dataset

rando

whether to select indices randomly or in blocks

Attributes

See also

scalation.mathstat.TnT_Split
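The two selection modes can be illustrated with a short sketch. It is not the TnT_Split implementation: `testIndicesSketch` is a hypothetical name, the seeded `rng` parameter is added only for reproducibility, and treating "in blocks" as "the last n_test rows" is an assumption based on the LAST option named above.

```scala
import scala.util.Random

// Choose nTest of the nTotal row indices: a random sample, or the last block of rows.
def testIndicesSketch(nTotal: Int, nTest: Int, rando: Boolean,
                      rng: Random = new Random(0)): IndexedSeq[Int] =
  require(0 <= nTest && nTest <= nTotal, "test-set size out of range")
  if rando then rng.shuffle((0 until nTotal).toIndexedSeq).take(nTest).sorted
  else (nTotal - nTest) until nTotal
```

The remaining indices form the training set; taking the last block is the usual choice when row order carries meaning, e.g., for time-ordered data.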

def train(x_: MatrixD, y_: VectorD): Unit

Train the model 'y_ = f(x_) + e' on a given dataset, by optimizing the model parameters in order to minimize error '||e||' or maximize log-likelihood 'll'.

Value parameters

x_

the training/full data/input matrix (impl. classes may default to x)

y_

the training/full response/output vector (impl. classes may default to y)

Attributes

def train2(x_: MatrixD = ..., y_: MatrixD = ...): Unit

The train2 method should work like the train method, but should also optimize hyper-parameters (e.g., shrinkage or learning rate). Only implementing classes needing this capability should override this method.

Value parameters

x_

the training/full data/input matrix (defaults to full x)

y_

the training/full response/output matrix (defaults to full y)

Attributes

def trainNtest(x_: MatrixD = ..., y_: MatrixD = ...)(xx: MatrixD = ..., yy: MatrixD = ...): (MatrixD, MatrixD)

Train and test the predictive model y_ = f(x_) + e and report its QoF and plot its predictions. FIX: currently must be overridden if y is transformed (see TranRegression).

Value parameters

x_

the training/full data/input matrix (defaults to full x)

xx

the testing/full data/input matrix (defaults to full x)

y_

the training/full response/output matrix (defaults to full y)

yy

the testing/full response/output matrix (defaults to full y)

Attributes

def trainNtest2(x_: MatrixD = ..., y_: MatrixD = ...)(xx: MatrixD = ..., yy: MatrixD = ...): (MatrixD, MatrixD)

Train and test the predictive model y_ = f(x_) + e and report its QoF and plot its predictions. This version does auto-tuning. FIX: currently must be overridden if y is transformed (see TranRegression).

Value parameters

x_

the training/full data/input matrix (defaults to full x)

xx

the testing/full data/input matrix (defaults to full x)

y_

the training/full response/output matrix (defaults to full y)

yy

the testing/full response/output matrix (defaults to full y)

Attributes

def validate(rando: Boolean = ..., ratio: Double = ...)(idx: IndexedSeq[Int] = ...): (MatrixD, MatrixD)
def vif(skip: Int = ...): VectorD

Compute the Variance Inflation Factor (VIF) for each variable to test for multi-collinearity by regressing x_j against the rest of the variables. A VIF over 50 indicates that over 98% of the variance of x_j can be predicted from the other variables, so x_j may be a candidate for removal from the model. Note: override this method to use a superior regression technique.

Value parameters

skip

the number of columns of x at the beginning to skip in computing VIF
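The relation between the VIF threshold and the explained variance quoted above follows from the standard definition VIF_j = 1 / (1 - R_j^2), where R_j^2 is the coefficient of determination from regressing x_j on the other variables. A small sketch of that arithmetic, with the hypothetical helper name `vifFromRSq` (the regression that produces R_j^2 is not shown):

```scala
// VIF for variable x_j, given the R^2 obtained by regressing x_j on the other variables.
def vifFromRSq(rSq: Double): Double =
  require(rSq >= 0.0 && rSq < 1.0, "R^2 must be in [0, 1)")
  1.0 / (1.0 - rSq)
```

Solving 1 / (1 - R^2) > 50 gives R^2 > 0.98, which is the "over 98% of the variance" figure in the description.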

Attributes

Inherited methods

def equation: String

Print the model prediction equation in readable form. Override per model.

Attributes

Inherited from:
Model
def equationLaTeX: String

Print the model prediction equation in LaTeX form. Override per model.

Attributes

Inherited from:
Model
def getXy: MatrixD

Return the data matrix x concatenated with response vector y.

Attributes

Inherited from:
Model
inline def modelName: String

Get the model name.

Attributes

Inherited from:
Model
def qof2Stat(qof: VectorD): Array[Statistic]

Convert QoF results into an array (of size 1) of Statistic for compatibility with the crossValidate method.

Value parameters

qof

the Quality of Fit (QoF) results

Attributes

Inherited from:
Model
def report(ftVec: VectorD): String

Return a basic report on a trained and tested model.

Value parameters

ftVec

the vector of qof values produced by the Fit trait

Attributes

Inherited from:
Model
def screen(xy: MatrixD, thr1: Double = ..., thr2: Double = ...)(dep: MatrixD = ...): (MatrixD, VectorI)

Screen the x-columns of matrix xy based on the two thresholds, returning the reduced matrix and the column indices/predictor variables selected.

Value parameters

dep

the variable/column dependency measure (defaults to correlation)

thr1

the threshold used to compare the predictor x-columns to the y-column (only variables above some minimal dependency level are wanted)

thr2

the threshold used to compare the predictor x-columns with each other (only variables below some cut-off dependency/collinearity level are wanted)

xy

the [ x, y ] combined data-response matrix
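One way to read the two thresholds is as a greedy filter over a dependency matrix. The sketch below is an assumption about how such screening could work, not the library's algorithm: `screenSketch` is a hypothetical name, `dep` is taken to be a symmetric dependency (e.g., correlation) matrix for the columns of [ x, y ] with y last, and columns are considered in index order.

```scala
// dep is a symmetric (n+1)-by-(n+1) dependency matrix for the columns of [ x, y ];
// the last row/column corresponds to y. Returns the indices of the x-columns kept.
def screenSketch(dep: Array[Array[Double]], thr1: Double, thr2: Double): IndexedSeq[Int] =
  val yIdx = dep.length - 1
  (0 until yIdx).foldLeft(Vector.empty[Int]) { (kept, j) =>
    val relevant  = math.abs(dep(j)(yIdx)) > thr1                  // enough dependency on y
    val redundant = kept.exists(k => math.abs(dep(j)(k)) >= thr2)  // too close to a kept column
    if relevant && !redundant then kept :+ j else kept
  }
```

Because the filter is greedy, the result depends on column order: an earlier column can block a later, collinear one even if the later column depends more strongly on y.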

Attributes

Inherited from:
Model
def selectFeatures(tech: SelectionTech, cross: String = ..., first: Int = ..., swap: Boolean = ...)(using qk: Int): (LinkedHashSet[Int], ArrayBuffer[VectorD])

Perform feature selection to find the most predictive features/variables to have in the model, returning the features/variables added and the new Quality of Fit (QoF) measures/metrics for all steps.

Value parameters

cross

indicator to include the cross-validation/validation QoF measure (defaults to "many")

first

the first variable to consider for elimination (the default of 1 assumes the intercept x_0 will be in any model)

qk

index of Quality of Fit (QoF) to use for comparing quality

swap

whether to allow a swap step (swap out a feature for a new feature in one step)

tech

the feature selection technique to apply

Attributes

See also

Fit for index of QoF measures/metrics.

Inherited from:
FeatureSelection
inline def taskType: TaskType

Get the type of the task performed by model.

Attributes

Inherited from:
Model

Inherited fields

var modelConcept: URI

The optional reference to an ontological concept

Attributes

Inherited from:
Model