Optimizer_SGD

scalation.modeling.neuralnet.Optimizer_SGD

The Optimizer_SGD class provides methods to optimize the parameters (weights and biases) of Neural Networks with various numbers of layers. This optimizer implements a Stochastic Gradient Descent algorithm.

Attributes

Supertypes: trait Optimizer, trait StoppingRule, trait MonitorLoss, class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

Given training data x and y for a multi-hidden layer Neural Network, fit the parameter array b, where each b(l) contains a weight matrix and bias vector. Iterate over several epochs, where each epoch divides the training set into nB batches. Each batch is used to update the weights.

Value parameters

b: the array of parameters (weights & biases) between every two adjacent layers
eta: the initial learning/convergence rate
f: the array of activation function families between every two adjacent layers
x: the m-by-n input matrix (training data consisting of m input vectors)
y: the m-by-ny output matrix (training data consisting of m output vectors)
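
The epoch and batch structure described above can be sketched in plain Scala. This is a minimal illustration under stated assumptions, not ScalaTion's implementation: it fits a one-feature linear model instead of a neural network, and the names SgdSketch, fit, batchSize, maxEpochs and seed are hypothetical. Only eta and the idea of splitting each epoch into batches come from the description.

```scala
import scala.util.Random

object SgdSketch {
  /** Fit y ~ w * x + c by mini-batch SGD on the mean squared error (illustrative only). */
  def fit(x: Array[Double], y: Array[Double], eta: Double,
          maxEpochs: Int = 1000, batchSize: Int = 4, seed: Long = 0L): (Double, Double) = {
    val rng = new Random(seed)
    var w = 0.0                                   // weight
    var c = 0.0                                   // bias
    for (_ <- 1 to maxEpochs) {
      // shuffle the row indices, then split them into batches of at most batchSize rows
      val idx = rng.shuffle(x.indices.toList)
      for (batch <- idx.grouped(batchSize)) {
        var gw = 0.0                              // gradient accumulator for w
        var gc = 0.0                              // gradient accumulator for c
        for (i <- batch) {
          val err = w * x(i) + c - y(i)           // prediction error on row i
          gw += err * x(i)
          gc += err
        }
        // each batch updates the parameters once, using the batch-mean gradient
        w -= eta * gw / batch.size
        c -= eta * gc / batch.size
      }
    }
    (w, c)
  }
}
```

With noise-free data such as y = 2x + 1 on x in [0, 1), a call like SgdSketch.fit(xs, ys, eta = 0.5) should recover a weight near 2 and a bias near 1.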

Given training data x and y for a 2-layer, multi-output Neural Network, fit the parameter/weight matrix b. Iterate over several epochs, where each epoch divides the training set into nB batches. Each batch is used to update the weights.

Value parameters

bb: the array of parameters (weights & biases) between every two adjacent layers
eta: the initial learning/convergence rate
ff: the array of activation function families between every two adjacent layers
x: the m-by-n input matrix (training data consisting of m input vectors)
y: the m-by-ny output matrix (training data consisting of m output vectors)

Given training data x and y for a 3-layer Neural Network, fit the parameters (weights and biases) a & b. Iterate over several epochs, where each epoch divides the training set into nB batches. Each batch is used to update the weights.

Value parameters

bb: the array of parameters (weights & biases) between every two adjacent layers
eta: the initial learning/convergence rate
ff: the array of activation function families between every two adjacent layers
x: the m-by-n input matrix (training data consisting of m input vectors)
y: the m-by-ny output matrix (training data consisting of m output vectors)

Inherited methods

Given training data x and y for a Neural Network, fit the parameters b, returning the value of the loss function and the number of epochs. Find the best learning rate within the interval etaI.

Value parameters

b: the array of parameters (weights & biases) between every two adjacent layers
etaI: the lower and upper bounds of learning/convergence rate
f: the array of activation function families between every two adjacent layers
opti: the optimization method to apply for each candidate learning rate
x: the m-by-n input matrix (training data consisting of m input vectors)
y: the m-by-ny output matrix (training data consisting of m output vectors)

Attributes

Inherited from: Optimizer
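
The page does not say how the interval etaI is searched. One simple possibility, shown here purely as an assumption, is to evaluate a fixed number of evenly spaced rates and keep the one with the lowest loss. The names EtaSearchSketch, bestEta and steps are hypothetical, and opti is reduced to a function from a learning rate to a loss value.

```scala
object EtaSearchSketch {
  /** Try steps + 1 evenly spaced rates in [etaI._1, etaI._2]; return (bestEta, bestLoss). */
  def bestEta(etaI: (Double, Double), steps: Int)(opti: Double => Double): (Double, Double) = {
    require(steps > 0, "steps must be positive")
    val (lo, hi) = etaI
    // candidate learning rates, including both interval end points
    val candidates = (0 to steps).map(k => lo + k * (hi - lo) / steps)
    // run the optimizer once per candidate and keep the pair with the smallest loss
    candidates.map(eta => (eta, opti(eta))).minBy(_._2)
  }
}
```

In practice opti would train the network from the same starting parameters for each candidate rate, so that the resulting losses are comparable.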

Collect the next value for the loss function.

Value parameters

loss: the value of the loss function

Attributes

Inherited from: MonitorLoss

Freeze layer flayer during back-propagation (should only impact the optimize method in the classes extending this trait). FIX: make abstract (remove ???) and implement in extending classes

Value parameters

flayer: the layer to freeze, e.g., 1 => first hidden layer

Attributes

Inherited from: Optimizer

Return the best/minimum loss seen.

Attributes

Inherited from: MonitorLoss
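
Collecting loss values and reporting the best one seen can be sketched as a small accumulator. This is an illustrative stand-in, not the MonitorLoss trait itself: the class name LossMonitorSketch and its members collect, best and history are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

class LossMonitorSketch {
  private val losses = ArrayBuffer.empty[Double]   // one entry per epoch

  /** Record the next loss value. */
  def collect(loss: Double): Unit = losses += loss

  /** Smallest loss recorded so far (Double.MaxValue if nothing was recorded). */
  def best: Double = if (losses.isEmpty) Double.MaxValue else losses.min

  /** Recorded losses in collection order, e.g. for plotting loss versus epoch. */
  def history: Vector[Double] = losses.toVector
}
```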

Return a permutation vector generator that provides a random permutation of index positions on each call to permGen.igen (e.g., used to select random batches).

Value parameters

m: the number of data instances
rando: whether to use a random or fixed random number stream

Attributes

Inherited from: Optimizer
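
The idea of drawing a fresh permutation of row indices for each epoch and cutting it into batches can be sketched with the standard library. This is an assumption-laden illustration rather than ScalaTion's generator: PermSketch, permutations and seed are hypothetical names, and a fixed seed stands in for the choice between a fixed and a random stream.

```scala
import scala.util.Random

object PermSketch {
  /** Return a function that yields a new random permutation of 0 until m on every call. */
  def permutations(m: Int, seed: Long = 0L): () => Vector[Int] = {
    val rng = new Random(seed)                     // fixed seed gives a repeatable stream
    () => rng.shuffle((0 until m).toVector)
  }
}
```

Calling the returned function once per epoch and applying grouped(batchSize) to the result gives the index sets of the random batches.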

Plot the loss function versus the epoch/major iterations.

Value parameters

optName: the name of optimization algorithm (alt. name of network)

Attributes

Inherited from: MonitorLoss

Stop when too many steps have the cost measure (e.g., sse) increasing. Signal a stopping condition by returning the best parameter vector, else null.

Value parameters

b: the current parameter value (weights and biases)
sse: the current value of cost measure (e.g., sum of squared errors)

Attributes

Inherited from: StoppingRule

Stop when too many steps have the cost measure (e.g., sse) increasing. Signal a stopping condition by returning the best parameter vector, else null.

Value parameters

b: the current value of the parameter vector
sse: the current value of cost measure (e.g., sum of squared errors)

Attributes

Inherited from: StoppingRule
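
A stopping rule of this kind can be sketched as a counter of steps without improvement. The sketch makes several assumptions that the page does not confirm: a step counts against the limit when sse fails to beat the best value seen, the rule fires once the count exceeds upLimit, and Option is used in place of a null return. StoppingRuleSketch and its fields are hypothetical names.

```scala
class StoppingRuleSketch(upLimit: Int) {
  private var bestSse = Double.MaxValue            // lowest cost measure seen so far
  private var bestB: Vector[Double] = Vector.empty // parameters that produced bestSse
  private var upCount = 0                          // consecutive steps without improvement

  /** Return Some(best parameters) when it is time to stop, else None. */
  def stopWhen(b: Vector[Double], sse: Double): Option[Vector[Double]] = {
    if (sse < bestSse) { bestSse = sse; bestB = b; upCount = 0 }
    else upCount += 1
    if (upCount > upLimit) Some(bestB) else None
  }
}
```

A training loop would call stopWhen once per epoch and break out as soon as it returns a value, restoring the returned parameters.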

Stop when too many steps have the cost measure (e.g., loss) increasing. Signal a stopping condition by returning the best list of parameters, else null.

Value parameters

loss: the current loss value.
params: the current list of Variabl parameters (e.g., weights, biases).
upLimit: the maximum number of consecutive steps allowed without improvement.

Attributes

Returns: A tuple containing (best_params, best_loss) if patience is exceeded, else (null, best_loss).
Inherited from: StoppingRule

Early stopping with patience. If the loss does not improve (by more than EPSILON) for patience consecutive steps, signal a stopping condition by returning the best parameters and loss.

Value parameters

loss: the current loss value.
params: the current list of Variabl parameters (e.g., weights, biases).
patience: the number of epochs to wait without improvement.

Attributes

Returns: A tuple containing (best_params, best_loss) if patience is exceeded, else (null, best_loss).
Inherited from: StoppingRule
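
To see how patience interacts with the EPSILON threshold, the following sketch scans a recorded sequence of losses and reports the epoch at which training would stop. It is a simplified illustration: the value of EPSILON, the names PatienceSketch and scan, and the choice to return an epoch index instead of parameters are all assumptions.

```scala
object PatienceSketch {
  val EPSILON = 1e-9   // assumed threshold; the real constant may differ

  /** Return (epoch at which to stop, best loss); the epoch is None if patience is never exhausted. */
  def scan(losses: IndexedSeq[Double], patience: Int): (Option[Int], Double) = {
    var best = Double.MaxValue
    var wait = 0                                   // consecutive epochs without improvement
    var epoch = 0
    while (epoch < losses.length) {
      val loss = losses(epoch)
      // only a drop of more than EPSILON counts as an improvement
      if (loss < best - EPSILON) { best = loss; wait = 0 } else wait += 1
      if (wait >= patience) return (Some(epoch), best)
      epoch += 1
    }
    (None, best)
  }
}
```

For the losses 1.0, 0.8, 0.8, 0.8, 0.7 and a patience of 2, the scan stops at epoch index 3 with a best loss of 0.8, so the later improvement to 0.7 is never reached. A larger patience trades extra epochs for a lower chance of stopping on a plateau.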

Inherited fields

Attributes

Inherited from: StoppingRule
