Optimizer_SGD

scalation.modeling.neuralnet.Optimizer_SGD
class Optimizer_SGD extends Optimizer

The Optimizer_SGD class provides methods to optimize the parameters (weights and biases) of Neural Networks with various numbers of layers. It implements the Stochastic Gradient Descent (SGD) algorithm.

Supertypes
trait Optimizer
trait StoppingRule
trait MonitorLoss
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def optimize(x: MatrixD, y: MatrixD, b: NetParams, eta: Double, f: Array[AFF]): (Double, Int)

Given training data x and y for a Neural Network with multiple hidden layers, fit the parameter array b, where each b(l) contains a weight matrix and a bias vector. Training iterates over several epochs; each epoch divides the training set into nB batches, and each batch is used to update the weights.

Value parameters

b

the array of parameters (weights & biases) between every two adjacent layers

eta

the initial learning/convergence rate

f

the array of activation function families, one for every two adjacent layers

x

the m-by-n input matrix (training data consisting of m input vectors)

y

the m-by-ny output matrix (training data consisting of m output vectors)
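
The epoch and batch structure described above can be illustrated with a minimal sketch. It does not use ScalaTion: plain Scala arrays stand in for MatrixD and NetParams, the model is a single linear unit rather than a multi-layer network, and the names sgdSketch, maxEpochs, batchSize and seed are assumptions made for this example only.

```scala
import scala.util.Random

// Hypothetical stand-in: fit y ≈ w * x + c with mini-batch SGD.
// Plain Scala collections replace ScalaTion's MatrixD and NetParams here.
def sgdSketch(x: Array[Double], y: Array[Double], eta: Double,
              maxEpochs: Int, batchSize: Int, seed: Long = 0L): (Double, Double, Double) = {
  val rng = new Random(seed)
  var w = 0.0                                      // weight
  var c = 0.0                                      // bias
  for (_ <- 1 to maxEpochs) {
    val perm = rng.shuffle(x.indices.toVector)     // new random order for each epoch
    for (batch <- perm.grouped(batchSize)) {       // divide the epoch into batches
      var gw = 0.0
      var gc = 0.0
      for (i <- batch) {
        val e = (w * x(i) + c) - y(i)              // prediction error for instance i
        gw += e * x(i)
        gc += e
      }
      w -= eta * gw / batch.size                   // step against the mean batch gradient
      c -= eta * gc / batch.size
    }
  }
  // Final sum of squared errors over the whole training set
  val sse = x.indices.map { i => val e = w * x(i) + c - y(i); e * e }.sum
  (w, c, sse)
}
```

For noise-free data such as y = 2x + 1, a call like sgdSketch(xs, ys, eta = 0.1, maxEpochs = 500, batchSize = 5) should recover a weight close to 2 and a bias close to 1.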

def optimize2(x: MatrixD, y: MatrixD, bb: NetParams, eta: Double, ff: Array[AFF]): (Double, Int)

Given training data x and y for a 2-layer, multi-output Neural Network, fit the parameter/weight matrix b. Training iterates over several epochs; each epoch divides the training set into nB batches, and each batch is used to update the weights.

Value parameters

bb

the array of parameters (weights & biases) between every two adjacent layers

eta

the initial learning/convergence rate

ff

the array of activation function families, one for every two adjacent layers

x

the m-by-n input matrix (training data consisting of m input vectors)

y

the m-by-ny output matrix (training data consisting of m output vectors)

def optimize3(x: MatrixD, y: MatrixD, bb: NetParams, eta: Double, ff: Array[AFF]): (Double, Int)

Given training data x and y for a 3-layer Neural Network, fit the parameters (weights and biases) a and b. Training iterates over several epochs; each epoch divides the training set into nB batches, and each batch is used to update the weights.

Value parameters

bb

the array of parameters (weights & biases) between every two adjacent layers

eta

the initial learning/convergence rate

ff

the array of activation function families, one for every two adjacent layers

x

the m-by-n input matrix (training data consisting of m input vectors)

y

the m-by-ny output matrix (training data consisting of m output vectors)

Inherited methods

def auto_optimize(x: MatrixD, y: MatrixD, b: NetParams, etaI: (Double, Double), f: Array[AFF], opti: (MatrixD, MatrixD, NetParams, Double, Array[AFF]) => (Double, Int)): (Double, Int)

Given training data x and y for a Neural Network, fit the parameters b, returning the value of the loss function and the number of epochs. Find the best learning rate within the interval etaI.

Value parameters

b

the array of parameters (weights & biases) between every two adjacent layers

etaI

the lower and upper bounds of learning/convergence rate

f

the array of activation function families, one for every two adjacent layers

opti

the optimization function to apply (its signature matches optimize, optimize2 and optimize3)

x

the m-by-n input matrix (training data consisting of m input vectors)

y

the m-by-ny output matrix (training data consisting of m output vectors)
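
The page does not say how the interval etaI is searched. As one possible reading, the sketch below tries evenly spaced learning rates and keeps the run with the smallest loss. The grid strategy, the name autoOptimizeSketch, the steps parameter and the simplified opti type (a function of the learning rate only) are all assumptions for illustration, not ScalaTion's implementation.

```scala
// Hypothetical sketch: evaluate evenly spaced learning rates in [etaI._1, etaI._2]
// and return the best rate together with its (loss, epochs) result.
// `opti` is simplified to a function of the learning rate alone.
def autoOptimizeSketch(etaI: (Double, Double), steps: Int)
                      (opti: Double => (Double, Int)): (Double, (Double, Int)) = {
  require(steps >= 2, "need at least two candidate rates")
  val (lo, hi) = etaI
  val candidates = (0 until steps).map(i => lo + i * (hi - lo) / (steps - 1))
  candidates
    .map(eta => (eta, opti(eta)))                  // run the optimizer once per candidate
    .minBy { case (_, (loss, _)) => loss }         // keep the run with the smallest loss
}
```

In practice the opti argument would wrap a call such as optimize(x, y, b, eta, f) with everything except eta held fixed.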

Inherited from: Optimizer

def collectLoss(loss: Double): Unit

Collect the next value for the loss function.

Value parameters

loss

the value of the loss function
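
A loss monitor of this kind can be sketched as a small class that records one loss value per epoch and reports the smallest one seen. The class name LossMonitorSketch, the history method and the value returned before any loss has been collected are assumptions for this example; the actual MonitorLoss trait may store and expose its data differently.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a loss monitor: record losses, report the minimum.
class LossMonitorSketch {
  private val losses = ArrayBuffer.empty[Double]

  // Record the next loss value (e.g., once per epoch)
  def collectLoss(loss: Double): Unit = losses += loss

  // Smallest loss seen so far; Double.MaxValue if nothing was collected (an assumption)
  def getBestLoss: Double = if (losses.isEmpty) Double.MaxValue else losses.min

  // All collected values in order, e.g., for plotting loss versus epoch
  def history: Seq[Double] = losses.toSeq
}
```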

Inherited from: MonitorLoss

def freeze(flayer: Int): Unit

Freeze layer flayer during back-propagation (should only impact the optimize method in the classes extending this trait). FIX: make abstract (remove ???) and implement in extending classes.

Value parameters

flayer

the layer to freeze, e.g., 1 => first hidden layer

Inherited from: Optimizer

def getBestLoss: Double

Return the best/minimum loss seen.

Inherited from: MonitorLoss

def permGenerator(m: Int, rando: Boolean = ...): PermutedVecI

Return a permutation vector generator that provides a random permutation of index positions on each call to permGen.igen (e.g., used to select random batches).

Value parameters

m

the number of data instances

rando

whether to use a random or fixed random number stream
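
To show how a permutation of index positions yields random batches, here is a minimal sketch built on scala.util.Random rather than ScalaTion's PermutedVecI. The names PermGenSketch and batches, and the use of a seed in place of the rando flag, are assumptions for this example.

```scala
import scala.util.Random

// Hypothetical stand-in for a permutation generator: each call to `igen`
// returns a fresh random ordering of the indices 0 until m.
class PermGenSketch(m: Int, seed: Long = 0L) {
  private val rng = new Random(seed)
  def igen: Vector[Int] = rng.shuffle((0 until m).toVector)
}

// Split one permutation into at most nB batches of size ceil(m / nB).
def batches(perm: Vector[Int], nB: Int): Vector[Vector[Int]] = {
  val size = math.ceil(perm.size.toDouble / nB).toInt
  perm.grouped(size).toVector
}
```

Calling igen once per epoch and passing the result to batches gives every data instance exactly one batch per epoch, in a different order each time.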

Inherited from: Optimizer

def plotLoss(optName: String): Unit

Plot the loss function versus the epoch/major iterations.

Value parameters

optName

the name of optimization algorithm (alt. name of network)

Inherited from: MonitorLoss

def stopWhen(b: NetParams, sse: Double): (NetParams, Double)

Stop when the cost measure (e.g., sse) has increased for too many steps. A stopping condition is signaled by returning the best parameter vector; otherwise null is returned.

Value parameters

b

the current parameter value (weights and biases)

sse

the current value of cost measure (e.g., sum of squared errors)

Inherited from: StoppingRule

def stopWhen(b: VectorD, sse: Double): (VectorD, Double)

Stop when the cost measure (e.g., sse) has increased for too many steps. A stopping condition is signaled by returning the best parameter vector; otherwise null is returned.

Value parameters

b

the current value of the parameter vector

sse

the current value of cost measure (e.g., sum of squared errors)
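
From the caller's side, the null convention means a training loop checks the first component of the returned tuple after every epoch. The sketch below shows one way such a loop might look. It uses Vector[Double] in place of VectorD or NetParams, and the names runUntilStopped and step are hypothetical.

```scala
// Hypothetical training loop consuming a stopWhen-style rule: a null first component
// means "keep going", a non-null one means "stop and use these parameters".
// Returns Some((bestParams, bestSse, epoch)) on an early stop, None otherwise.
def runUntilStopped(maxEpochs: Int,
                    step: Int => (Vector[Double], Double),
                    stopWhen: (Vector[Double], Double) => (Vector[Double], Double)): Option[(Vector[Double], Double, Int)] = {
  var epoch = 1
  while (epoch <= maxEpochs) {
    val (b, sse) = step(epoch)                       // one epoch of training (stand-in)
    val (bBest, sseBest) = stopWhen(b, sse)          // ask the rule whether to stop
    if (bBest != null) return Some((bBest, sseBest, epoch))
    epoch += 1
  }
  None                                               // the rule never signaled a stop
}
```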

Inherited from: StoppingRule

def stopWhenContinuous(params: IndexedSeq[Variabl], loss: Double, upLimit: Int): (IndexedSeq[Variabl], Double)

Stop when the cost measure (e.g., loss) has increased for too many steps. A stopping condition is signaled by returning the best list of parameters; otherwise null is returned.

Value parameters

loss

the current loss value.

params

the current list of Variabl parameters (e.g., weights, biases).

upLimit

the maximum number of consecutive steps allowed without improvement.

Returns

A tuple containing (best_params, best_loss) if patience is exceeded, else (null, best_loss).

Inherited from: StoppingRule

def stopWhenPatience(params: IndexedSeq[Variabl], loss: Double, patience: Int): (IndexedSeq[Variabl], Double)

Early stopping with patience. If the loss does not improve (by more than EPSILON) for patience consecutive steps, signal a stopping condition by returning the best parameters and loss.

Value parameters

loss

the current loss value.

params

the current list of Variabl parameters (e.g., weights, biases).

patience

the number of epochs to wait without improvement.

Returns

A tuple containing (best_params, best_loss) if patience is exceeded, else (null, best_loss).
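
The patience rule described above can be sketched as follows. This is an illustration under assumptions, not the StoppingRule code: Vector[Double] replaces IndexedSeq[Variabl], the class name PatienceSketch and the counter noImprove are hypothetical, epsilon is passed in because the value of EPSILON is not given on this page, and the stop is signaled once the counter reaches patience.

```scala
// Hypothetical early stopping with patience: a step counts as an improvement only if
// the loss drops by more than epsilon; after `patience` consecutive steps without one,
// signal a stop by returning the best parameters instead of null.
class PatienceSketch(epsilon: Double) {
  private var bestParams: Vector[Double] = null
  private var bestLoss = Double.MaxValue
  private var noImprove = 0                          // consecutive steps without improvement

  def stopWhenPatience(params: Vector[Double], loss: Double, patience: Int): (Vector[Double], Double) = {
    if (loss < bestLoss - epsilon) {                 // improved by more than epsilon
      bestParams = params
      bestLoss = loss
      noImprove = 0
    } else noImprove += 1
    if (noImprove >= patience) (bestParams, bestLoss) else (null, bestLoss)
  }
}
```

With epsilon = 0.01 and patience = 2, a loss sequence of 1.0, 0.5, 0.499, 0.498 signals a stop on the fourth step, because the last two values improve on 0.5 by less than epsilon.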

Inherited from: StoppingRule

Inherited fields

protected val EPSILON: Double

Inherited from: StoppingRule