SGD

scalation.modeling.autograd.SGD
case class SGD(parameters: IndexedSeq[Variabl], lr: Double, momentum: Double = ...) extends Optimizer

Implements the Stochastic Gradient Descent (SGD) optimization algorithm.

Value parameters

lr

the learning rate used for updating the parameters.

momentum

momentum factor to accelerate convergence (default is 0.0).

parameters

an indexed sequence of model parameters to be optimized.

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Optimizer
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

override def step(): Unit

Performs a single optimization step using the SGD algorithm. For each parameter:

  • Updates the velocity using the momentum factor and the current gradient.
  • Updates the parameter data by subtracting the computed velocity.
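
The two steps above can be sketched with plain arrays. This is a minimal illustration, not the library's implementation: `Param`, `sgdStep`, and the separate `velocity` buffers are hypothetical names, and the sketch assumes the learning rate is folded into the velocity (v ← momentum · v + lr · grad, then p ← p − v). Other SGD variants apply the learning rate when subtracting instead.

```scala
// Hypothetical stand-in for a model parameter: plain arrays rather than the library's types.
final class Param(val data: Array[Double], val grad: Array[Double])

// One SGD step with momentum (assumed form, see the note above):
//   v <- momentum * v + lr * grad
//   p <- p - v
def sgdStep(params: IndexedSeq[Param], velocity: IndexedSeq[Array[Double]],
            lr: Double, momentum: Double): Unit =
  for k <- params.indices do
    val p = params(k)
    val v = velocity(k)                          // one velocity buffer per parameter
    for i <- p.data.indices do
      v(i) = momentum * v(i) + lr * p.grad(i)    // update the velocity
      p.data(i) -= v(i)                          // subtract the velocity from the data
```

With momentum = 0.0 the velocity is just lr · grad, so the sketch reduces to plain gradient descent.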

Attributes

Definition Classes

Inherited methods

def clipGradNorm(maxNorm: Double): Unit

Clip the gradients of all parameters by global L2 norm, so that the total norm is at most maxNorm. Math: let g = √(∑_p ‖grad_p‖²). If g > maxNorm, scale all gradients by (maxNorm / g); otherwise leave them unchanged.
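
The rule can be sketched on plain arrays as follows. `clipByGlobalNorm` is a hypothetical helper that treats each parameter's gradient as an `Array[Double]`; the library's own method operates on its parameter objects and may differ in detail.

```scala
// Hypothetical helper: each entry of `grads` is one parameter's gradient.
def clipByGlobalNorm(grads: IndexedSeq[Array[Double]], maxNorm: Double): Unit =
  // g = sqrt of the sum of squares over every entry of every gradient
  val g = math.sqrt(grads.iterator.flatMap(_.iterator).map(x => x * x).sum)
  if g > maxNorm then
    val scale = maxNorm / g
    // the same factor is applied to all gradients, so their direction is preserved
    grads.foreach(grad => grad.indices.foreach(i => grad(i) *= scale))
```

For gradients (3, 0) and (4), g = 5; with maxNorm = 1 every entry is multiplied by 0.2.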

Attributes

Inherited from:
Optimizer

def clipGradValue(minVal: Double, maxVal: Double): Unit

Clip the gradients of all parameters by value (element-wise). Each gradient entry smaller than minVal is set to minVal, and each entry larger than maxVal is set to maxVal.
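
A minimal sketch of element-wise clipping, again on plain arrays. `clipByValue` is a hypothetical name used only for illustration.

```scala
// Hypothetical helper: clamp every gradient entry into [minVal, maxVal].
def clipByValue(grads: IndexedSeq[Array[Double]], minVal: Double, maxVal: Double): Unit =
  grads.foreach { grad =>
    for i <- grad.indices do
      grad(i) = math.min(maxVal, math.max(minVal, grad(i)))
  }
```

Unlike clipping by global norm, this treats each entry independently, so it can change the direction of the overall gradient.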

Attributes

Inherited from:
Optimizer

def gradNorm: Double

Compute the global L2 norm of all parameter gradients. Math: g = √(∑_p ‖grad_p‖²)
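
"Global" here means one norm over all gradients taken together, not one norm per parameter. The sketch below contrasts the two on plain arrays; `l2` and `globalNorm` are hypothetical helpers, not library functions.

```scala
// L2 norm of a single gradient.
def l2(grad: Array[Double]): Double =
  math.sqrt(grad.map(x => x * x).sum)

// Global L2 norm: sum the squared entries across all gradients, then take one square root.
def globalNorm(grads: IndexedSeq[Array[Double]]): Double =
  math.sqrt(grads.map(g => g.map(x => x * x).sum).sum)
```

For gradients (3, 0) and (4), the per-parameter norms are 3 and 4, while the global norm is √(9 + 0 + 16) = 5.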

Attributes

Inherited from:
Optimizer

def productElementNames: Iterator[String]

An iterator over the names of all the elements of this product.

Attributes

Inherited from:
Product

def productIterator: Iterator[Any]

An iterator over all the elements of this product.
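
Both product methods come from Scala's `Product` trait, which every case class implements. The sketch below pairs field names with values using a hypothetical `SgdConfig` case class as a stand-in, since constructing a real optimizer requires library types not shown here.

```scala
// Hypothetical case class standing in for an optimizer's constructor parameters.
case class SgdConfig(lr: Double, momentum: Double)

val cfg = SgdConfig(0.01, 0.9)

// Zip the element names with the element values, in declaration order.
val fields: List[(String, Any)] =
  cfg.productElementNames.zip(cfg.productIterator).toList
// fields == List(("lr", 0.01), ("momentum", 0.9))
```

This can be handy for logging hyperparameters without naming each field by hand.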

Attributes

Returns

in the default implementation, an Iterator[Any]

Inherited from:
Product

def zeroGrad(): Unit

Reset gradients for all parameters. Typically called before the next forward/backward pass. Only parameters with non-null gradient buffers are updated.
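
The null check can be sketched as follows. `Param` and `zeroGradSketch` are hypothetical stand-ins using plain arrays; a parameter whose gradient buffer has not been allocated yet is modeled here as `grad == null`, which is an assumption about how the library represents that state.

```scala
// Hypothetical stand-in for a parameter; grad may be null if no gradient was ever computed.
final class Param(val data: Array[Double], var grad: Array[Double])

// Fill every existing gradient buffer with zeros; parameters without a buffer are skipped.
def zeroGradSketch(params: IndexedSeq[Param]): Unit =
  params.filter(_.grad != null).foreach(p => java.util.Arrays.fill(p.grad, 0.0))
```

Resetting matters when gradients accumulate across backward passes, as is common in autograd designs: without it, each step would use the sum of the current and earlier gradients.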

Attributes

Inherited from:
Optimizer

Inherited fields

var learningRate: Double

Attributes

Inherited from:
Optimizer