scalation.modeling.autograd

Computes the element-wise absolute value of a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The Adam class implements the Adam optimization algorithm for updating model parameters. The Adam optimizer (Kingma & Ba, 2015) with optional L2 weight decay maintains first (m) and second (v) moment estimates and applies bias correction. Classical (non-decoupled) weight decay is applied by adding weightDecay * param to the raw gradient.
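The update described above can be sketched on plain vectors. This is a minimal illustration, not ScalaTion's implementation: `AdamSketch`, `AdamState`, and `adamStep` are hypothetical names, the default hyperparameters are the ones suggested in the paper, and placing `eps` outside the square root is an assumption taken from the paper.

```scala
// Sketch only: AdamSketch, AdamState and adamStep are hypothetical names.
object AdamSketch:
  // Running first (m) and second (v) moment estimates plus the step counter t.
  final case class AdamState(m: Vector[Double], v: Vector[Double], t: Int)

  def adamStep(param: Vector[Double], grad: Vector[Double], st: AdamState,
               lr: Double = 0.001, beta1: Double = 0.9, beta2: Double = 0.999,
               eps: Double = 1e-8, weightDecay: Double = 0.0): (Vector[Double], AdamState) =
    val t = st.t + 1
    // Classical (coupled) weight decay: add weightDecay * param to the raw gradient.
    val g = param.zip(grad).map { case (p, gi) => gi + weightDecay * p }
    val m = st.m.zip(g).map { case (mi, gi) => beta1 * mi + (1.0 - beta1) * gi }
    val v = st.v.zip(g).map { case (vi, gi) => beta2 * vi + (1.0 - beta2) * gi * gi }
    // Bias-correction denominators for the zero-initialised moments.
    val c1 = 1.0 - math.pow(beta1, t.toDouble)
    val c2 = 1.0 - math.pow(beta2, t.toDouble)
    val next = param.indices.toVector.map { i =>
      param(i) - lr * (m(i) / c1) / (math.sqrt(v(i) / c2) + eps)
    }
    (next, AdamState(m, v, t))
```

On the first step the bias correction cancels the `(1 - beta)` factors, so the parameter moves by roughly `lr` in the direction opposite to the gradient's sign.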

Value parameters

beta1: exponential decay rate for the first moment estimates.
beta2: exponential decay rate for the second moment estimates.
eps: small constant added for numerical stability.
lr: base learning rate for updating the parameters.
parameters: indexed sequence of Variables representing model parameters.
weightDecay: L2 regularization coefficient (0.0 to disable).

Attributes

See also: https://arxiv.org/abs/1412.6980
Note: Call zeroGrad() before backward + step.
Supertypes: trait Serializable

trait Product

trait Equals

class Optimizer

class Object

trait Matchable

class Any

Computes element-wise addition of two variables.

Value parameters

v1: the first variable.
v2: the second variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Adds a constant value to a variable.

Value parameters

d: the constant to add.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The AutogradOps trait defines the core operations needed for automatic differentiation. It separates the mathematical operations on tensors (TensorD) from the autograd system (Variable, Function), allowing flexible extension. This trait is backed by a default implementation (see AutogradOps.default) using TensorD methods.

Attributes

Companion: object
Supertypes: class Object

trait Matchable

class Any
Known subtypes: object default

Companion object for AutogradOps that provides a default implementation.

Attributes

Companion: trait
Supertypes: class Object

trait Matchable

class Any
Self type: AutogradOps.type

The AutogradTest object contains various @main tests for autograd functionality. The tests validate basic arithmetic, complex expressions, activation functions, loss functions, and neural network layers with backpropagation.

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: AutogradTest.type

The BaseModule is a base class for all neural network modules (layers, blocks, models). Provides support for:

Parameter registration
Automatic submodule detection
Gradient management (zeroing)
Training/evaluation mode switching

Modules are structured hierarchically: a module can contain submodules.

Value parameters

localParameters: the parameters (Variables) directly belonging to this module

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class GRU

class Module

class LayerNorm

class Linear

class RNN

class SeqModule

class MultiHeadAttention

class ScaledDotProductAttention

Computes the batched matrix multiplication of two variables.

Value parameters

v1: the first variable.
v2: the second variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the ceil of a variable (element-wise).

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Clips the elements of a variable to the range [min, max] (element-wise). Gradient is 1 for elements strictly inside (min, max), 0 for clipped ones (ties get 0.25 via mask product heuristic).

Value parameters

max: upper bound.
min: lower bound.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Represents a concatenation operation on a sequence of variables along a specified axis. This class performs a differentiable concatenation operation during the forward pass and splits the gradient during the backward pass to propagate it to the input variables.
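The forward/backward pairing can be illustrated on one-dimensional data. The helper names below are hypothetical and the sketch ignores the axis argument; it only shows that the backward pass slices the upstream gradient at the same offsets the forward pass used.

```scala
// Sketch only: ConcatSketch and its helpers are hypothetical names.
object ConcatSketch:
  // Forward: join the inputs end to end.
  def concatForward(vs: Seq[Vector[Double]]): Vector[Double] =
    vs.flatten.toVector

  // Backward: cut the upstream gradient into one slice per input.
  def concatBackward(gradOut: Vector[Double], sizes: Seq[Int]): Seq[Vector[Double]] =
    // Running offsets mark where each input's slice starts in the output.
    val offsets = sizes.scanLeft(0)(_ + _)
    sizes.indices.map(i => gradOut.slice(offsets(i), offsets(i + 1)))
```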

Value parameters

axis: the axis along which to concatenate the variables
vs: the sequence of input variables to concatenate

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

ANSI color codes for colored console output.

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: ConsoleColor.type

Computes element-wise division of two variables.

Value parameters

v1: the dividend.
v2: the divisor.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Divides a variable by a constant.

Value parameters

d: the constant divisor.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the dot product of two variables.

Value parameters

v1: the first variable.
v2: the second variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Applies the ELU activation function.

Value parameters

alpha: the ELU scaling parameter.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the exponential of a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the floor of a variable (element-wise).

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The Function trait is the base trait for all differentiable operations in the autograd system. A Function encapsulates both the forward computation (producing outputs) and the backward computation (propagating gradients). It also provides utility methods for handling unbroadcasting of shapes during the backward pass, ensuring correct gradient flow. Every custom operation should extend this trait and implement forward and backward.
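Unbroadcasting is easiest to see in the two-dimensional case. The sketch below assumes an operand of shape (1, cols) was broadcast against a (rows, cols) tensor in the forward pass; its gradient is then the column-wise sum of the upstream gradient. `UnbroadcastSketch` and `unbroadcastRows` are hypothetical names and do not reflect the trait's actual utility methods.

```scala
// Sketch only: hypothetical helper, not the Function trait's API.
object UnbroadcastSketch:
  // Reduce a (rows, cols) gradient back to a (1, cols) operand by summing
  // over the axis that was expanded during the forward pass.
  def unbroadcastRows(grad: Vector[Vector[Double]]): Vector[Double] =
    grad.transpose.map(_.sum)
```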

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class Abs

class Add

class AddConstant

class BatchMatMul

class Ceil

class Clip

class Concat

class Div

class DivConstant

class Dot

class ELU

class Exp

class Floor

class GRUCellFused

class GeLU

class Identity

class LeakyReLU

class Log

class LogBase

class MAELoss

class MSELoss

class MatMul

class Max

class MaxScalar

class MaxValue

class Mean

class MeanAlongAxis

class Min

class MinScalar

class MinValue

class Mul

class MulConstant

class Neg

class Permute

class Pow

class RNNCellFused

class RNNFused

class ReLU

class Reciprocal

class Reshape

class Round

class SSELoss

class Sigmoid

class Sign

class Slice

class Softmax

class Sqrt

class Std

class StdAlongAxis

class Sub

class SubConstant

class Sum

class Tanh

class Transpose

class Variance

class VarianceAlongAxis

The GRU object provides a factory method for creating instances of the GRU class.

Attributes

Companion: class
Supertypes: class Object

trait Matchable

class Any
Self type: GRU.type

The GRU class implements a multi-layer gated recurrent unit (GRU) network. It supports stacked GRU layers, where each layer processes the input sequence and passes its output to the next layer. The class also provides methods for parameter retrieval and forward computation.

Value parameters

hiddenSize: number of features in the hidden state
inputSize: number of features in the input at each time step
numLayers: number of stacked GRU layers (default: 1)

Attributes

See also: https://pytorch.org/docs/stable/generated/torch.nn.GRU.html
Companion: object
Supertypes: class BaseModule

class Object

trait Matchable

class Any

The GRUCell class supports a gated recurrent unit cell:

r_t = sigmoid(W_ir * x + b_ir + W_hr * h_{t-1} + b_hr)
z_t = sigmoid(W_iz * x + b_iz + W_hz * h_{t-1} + b_hz)
n_t = tanh(W_in * x + b_in + r_t ⊙ (W_hn * h_{t-1} + b_hn))
h_t = (1 - z_t) ⊙ n_t + z_t ⊙ h_{t-1}

This class defines the parameters and forward computation for a GRU cell.
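The four gate equations can be traced for a single (unbatched) input with plain vectors. This is an illustrative sketch only: `GruCellSketch`, `GruParams`, and `gruStep` are hypothetical names, and ScalaTion's class works on batched tensors instead.

```scala
// Sketch only: names are hypothetical; one unbatched time step.
object GruCellSketch:
  private def matVec(w: Vector[Vector[Double]], x: Vector[Double]): Vector[Double] =
    w.map(row => row.zip(x).map { case (a, b) => a * b }.sum)
  private def add(a: Vector[Double], b: Vector[Double]): Vector[Double] =
    a.zip(b).map { case (p, q) => p + q }
  private def sigmoid(a: Double): Double = 1.0 / (1.0 + math.exp(-a))

  // Input-side (wI*, bI*) and hidden-side (wH*, bH*) parameters for gates r, z, n.
  final case class GruParams(
    wIr: Vector[Vector[Double]], wHr: Vector[Vector[Double]], bIr: Vector[Double], bHr: Vector[Double],
    wIz: Vector[Vector[Double]], wHz: Vector[Vector[Double]], bIz: Vector[Double], bHz: Vector[Double],
    wIn: Vector[Vector[Double]], wHn: Vector[Vector[Double]], bIn: Vector[Double], bHn: Vector[Double])

  def gruStep(p: GruParams, x: Vector[Double], hPrev: Vector[Double]): Vector[Double] =
    // Reset and update gates.
    val r = add(add(matVec(p.wIr, x), p.bIr), add(matVec(p.wHr, hPrev), p.bHr)).map(sigmoid)
    val z = add(add(matVec(p.wIz, x), p.bIz), add(matVec(p.wHz, hPrev), p.bHz)).map(sigmoid)
    // Candidate state: the reset gate scales only the hidden-side term.
    val hn = add(matVec(p.wHn, hPrev), p.bHn)
    val gated = r.zip(hn).map { case (a, b) => a * b }
    val n = add(add(matVec(p.wIn, x), p.bIn), gated).map(math.tanh)
    // h_t = (1 - z_t) ⊙ n_t + z_t ⊙ h_{t-1}
    hPrev.indices.toVector.map(i => (1.0 - z(i)) * n(i) + z(i) * hPrev(i))
```

With all weights and biases at zero, both gates evaluate to 0.5 and the candidate to 0, so the new hidden state is half the previous one.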

Value parameters

hiddenSize: number of hidden units
inputSize: number of input features

Attributes

See also: https://pytorch.org/docs/stable/generated/torch.nn.GRUCell.html
Companion: object
Supertypes: class SeqModule

class BaseModule

class Object

trait Matchable

class Any

The GRUCell object provides a factory method for creating instances of the GRUCell class.

Attributes

Companion: class
Supertypes: class Object

trait Matchable

class Any
Self type: GRUCell.type

The GRUCellFused Function implements a single GRU cell as one fused autograd op. It fuses all gate computations for better performance and fewer autograd nodes.

Equations:
r_t = sigmoid(W_ir * x + b_ir + W_hr * hPrev + b_hr)
z_t = sigmoid(W_iz * x + b_iz + W_hz * hPrev + b_hz)
n_t = tanh(W_in * x + b_in + r_t ⊙ (W_hn * hPrev + b_hn))
h_t = (1 - z_t) ⊙ n_t + z_t ⊙ hPrev

Shapes:
input: (B, I, 1)
hidden: (B, H, 1)
W_i*: (1, H, I)
W_h*: (1, H, H)
b_i*, b_h*: (1, H, 1)

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Applies the GeLU activation function.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The GradCheck object provides methods to check the agreement between numerically computed gradients and those computed using Automatic Differentiation (AD).
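The idea behind such a check can be sketched with central differences. The names, step size `h`, and tolerance `tol` below are assumptions made for illustration and are not taken from GradCheck itself.

```scala
// Sketch only: GradCheckSketch and its methods are hypothetical names.
object GradCheckSketch:
  // Central-difference estimate of the gradient of f at x.
  def numericGrad(f: Vector[Double] => Double, x: Vector[Double],
                  h: Double = 1e-6): Vector[Double] =
    x.indices.toVector.map { i =>
      val up = x.updated(i, x(i) + h)
      val dn = x.updated(i, x(i) - h)
      (f(up) - f(dn)) / (2.0 * h)
    }

  // True when every analytic component is within tol of the numeric estimate.
  def agrees(analytic: Vector[Double], numeric: Vector[Double],
             tol: Double = 1e-5): Boolean =
    analytic.zip(numeric).forall { case (a, n) => math.abs(a - n) <= tol }
```

For example, for f(x) = x0² + 3·x1 at (2, 1) the analytic gradient (4, 3) should agree with the numeric estimate, while a deliberately wrong gradient such as (5, 3) should not.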

Attributes

See also: calculus.Differential
Supertypes: class Object

trait Matchable

class Any
Self type: GradCheck.type

GraphExporter generates a computation graph visualization from a root Variabl. The graph includes variables, functions, dependency edges, tensor shapes, and optional gradient annotations. The resulting graph can be serialized to DOT, Mermaid, or JSON formats for visualization.

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: GraphExporter.type

Applies the identity activation function.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Learning Rate Scheduler (LR Scheduler) trait. Defines a generic interface for schedulers that adjust the learning rate during optimization. Concrete implementations may update the learning rate based on iteration count, loss values, or other criteria.

Notes:

The parameterless step() is intended for schedulers that adjust the learning rate solely based on iteration count.
The step(currentLoss) method is intended for schedulers that adapt the learning rate based on the current loss value.
By default, both methods throw UnsupportedOperationException; subclasses must override the method(s) they support.

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class ReduceLROnPlateau

class StepLR

The LayerNorm class implements Layer Normalization as described in: "Layer Normalization" by Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton
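As a rough illustration of the computation in the paper, the sketch below normalizes one feature vector to zero mean and unit variance and then applies a gain and bias. The names `gamma` and `beta`, the default `eps`, and the use of the population variance are assumptions; the class's learnable parameters may be named and shaped differently.

```scala
// Sketch only: LayerNormSketch and layerNorm are hypothetical names.
object LayerNormSketch:
  def layerNorm(x: Vector[Double], gamma: Vector[Double], beta: Vector[Double],
                eps: Double = 1e-5): Vector[Double] =
    val mean = x.sum / x.size
    // Population variance over the feature dimension.
    val variance = x.map(xi => (xi - mean) * (xi - mean)).sum / x.size
    // eps keeps the denominator away from zero for constant inputs.
    val denom = math.sqrt(variance + eps)
    x.indices.toVector.map(i => gamma(i) * (x(i) - mean) / denom + beta(i))
```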

Value parameters

dModel: the number of features in the input
eps: a small value to avoid division by zero
ops: the autograd operations

Attributes

See also: https://arxiv.org/abs/1607.06450
Supertypes: class Module

class BaseModule

class Object

trait Matchable

class Any

Applies the LeakyReLU activation function.

Value parameters

alpha: the negative slope coefficient.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Companion object for Linear, providing a simpler construction API.

Attributes

Companion: class
Supertypes: class Object

trait Matchable

class Any
Self type: Linear.type

A fully connected linear (affine) layer: output = weight.bmm(input) + bias. Computes a linear transformation of the input tensor:

Weight shape: (1, outFeatures, inFeatures)
Bias shape: (1, outFeatures, 1)
Input shape: (batch, inFeatures, 1)
Output shape: (batch, outFeatures, 1)

The weight and bias are learnable parameters wrapped in Variabl. Internally, the layer uses batched matrix multiplication and broadcasts the bias during addition.
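The affine map can be sketched without tensors by treating each batch item as a plain feature vector, which drops the trailing size-1 axis from the shapes listed above. `LinearSketch` and `linear` are hypothetical names used only for illustration.

```scala
// Sketch only: hypothetical helper; weight is outFeatures x inFeatures.
object LinearSketch:
  def linear(weight: Vector[Vector[Double]], bias: Vector[Double],
             batch: Vector[Vector[Double]]): Vector[Vector[Double]] =
    batch.map { x =>
      // One output feature per weight row; the same bias is reused for every
      // batch item, which is what broadcasting achieves in the tensor version.
      weight.zip(bias).map { case (row, b) =>
        row.zip(x).map { case (w, xi) => w * xi }.sum + b
      }
    }
```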

Value parameters

inFeatures: the number of input features
outFeatures: the number of output features

Attributes

Companion: object
Supertypes: class Module

class BaseModule

class Object

trait Matchable

class Any

Computes the natural logarithm of a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the logarithm of a variable with a specified base.

Value parameters

base: the base for the logarithm.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the Mean Absolute Error (MAE) loss.

Value parameters

pred: the prediction variable.
target: the target variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the Mean Squared Error (MSE) loss.

Value parameters

pred: the prediction variable.
target: the target variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the matrix multiplication of two variables.

Value parameters

v1: the first variable.
v2: the second variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Element-wise maximum of two variables. Gradient flows to the larger input; ties split as 0.5 / 0.5.
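The gradient routing rule can be written as a per-element weight. The sketch below uses hypothetical names and plain vectors; it returns the gradients for the first and second input, in that order.

```scala
// Sketch only: MaxGradSketch and maxBackward are hypothetical names.
object MaxGradSketch:
  def maxBackward(v1: Vector[Double], v2: Vector[Double],
                  gradOut: Vector[Double]): (Vector[Double], Vector[Double]) =
    // Weight for v1: 1 where it wins, 0 where it loses, 0.5 on ties.
    val w1 = v1.zip(v2).map { case (a, b) =>
      if a > b then 1.0 else if a < b then 0.0 else 0.5
    }
    val g1 = w1.zip(gradOut).map { case (w, g) => w * g }
    // v2 receives the complementary share, so the two gradients sum to gradOut.
    val g2 = w1.zip(gradOut).map { case (w, g) => (1.0 - w) * g }
    (g1, g2)
```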

Value parameters

v1: first input.
v2: second input.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Element-wise maximum between a variable and a scalar. Gradient is 1 where v > s; 0 where v < s; 0.5 where equal.

Value parameters

s: the scalar.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the maximum value in a variable (reduces to a scalar). Gradient is distributed equally among all elements achieving the max (handles ties).
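A small sketch of the tie-handling rule, using a flat vector in place of a tensor. The names are hypothetical, and an exact floating-point comparison is assumed for detecting ties.

```scala
// Sketch only: MaxValueGradSketch and maxValueBackward are hypothetical names.
object MaxValueGradSketch:
  def maxValueBackward(v: Vector[Double], gradOut: Double): Vector[Double] =
    val mx = v.max
    // Every element equal to the maximum receives an equal share.
    val ties = v.count(_ == mx)
    v.map(x => if x == mx then gradOut / ties else 0.0)
```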

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the mean of all elements in a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the mean of a variable along a specified axis (dimension reduced to size 1).

Value parameters

axis: the axis along which to compute the mean.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Element-wise minimum of two variables. Gradient flows to the smaller input; ties split as 0.5 / 0.5.

Value parameters

v1: first input.
v2: second input.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Element-wise minimum between a variable and a scalar. Gradient is 1 where v < s; 0 where v > s; 0.5 where equal.

Value parameters

s: the scalar.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the minimum value in a variable (reduces to a scalar). Gradient is distributed equally among all elements achieving the min (handles ties).

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Standard module for layers that take a single input (e.g., Linear, Conv1D). Defines the abstract forward function for single input.

Value parameters

localParameters: the parameters (Variables) directly belonging to this module

Attributes

Supertypes: class BaseModule

class Object

trait Matchable

class Any
Known subtypes: class LayerNorm

class Linear

Computes element-wise multiplication of two variables.

Value parameters

v1: the first variable.
v2: the second variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Multiplies a variable by a constant.

Value parameters

d: the constant multiplier.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Implements the Multi-Head Attention mechanism, a key component of transformer models. This class performs linear projections of the input tensors, splits them into multiple attention heads, applies scaled dot-product attention to each head, and combines the results into a single output tensor.

Value parameters

dModel: the dimensionality of the model (input and output feature size)
numHeads: the number of attention heads

Attributes

See also: https://arxiv.org/abs/1706.03762 "Attention Is All You Need" by Vaswani et al., 2017.

https://dev-discuss.pytorch.org/t/understanding-multi-head-attention-for-ml-framework-developers/1792 "Understanding Multi-Head Attention for ML Framework Developers"

https://pytorch.org/docs/stable/generated/torch.nn.MultiheadAttention.html PyTorch MultiheadAttention Documentation
Supertypes: class SeqModule

class BaseModule

class Object

trait Matchable

class Any

Computes the negation of a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The Optimizer abstract class optimizes model parameters.

Notes:

Subclasses implement the specific update rule in step().
The optimizer assumes that gradients (p.grad) have been computed and accumulated by the autograd engine before each call to step().
Parameters with null gradients are safely ignored.

Value parameters

learningRate: the step size (η) used for gradient-based updates
parameters: the trainable parameters, each wrapped in a Variabl

Attributes

Supertypes: class Object

trait Matchable

class Any
Known subtypes: class Adam

class SGD

Permutes axes of a tensor variable according to a specified ordering.

Value parameters

axes: the permutation of axes.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Raises a variable to an integer power.

Value parameters

s: the exponent.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The RNN object provides a factory method for creating instances of the RNN class.

Attributes

Companion: class
Supertypes: class Object

trait Matchable

class Any
Self type: RNN.type

The RNN class implements a multi-layer recurrent neural network (RNN). It supports stacked RNN layers, where each layer processes the input sequence and passes its output to the next layer. The class also provides methods for parameter retrieval and forward computation.

Value parameters

activation: activation function to use: "tanh" (default) or "relu"
hiddenSize: number of features in the hidden state
inputSize: number of features in the input at each time step
numLayers: number of stacked RNN layers (default: 1)
ops: implicit autograd operations

Attributes

See also: https://pytorch.org/docs/stable/generated/torch.nn.RNN.html
Companion: object
Supertypes: class BaseModule

class Object

trait Matchable

class Any

The RNNCell class supports a simple RNN cell that updates the hidden state: h' = activation(W_ih * x + b_ih + W_hh * h + b_hh) using two biases instead of one.
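The update rule can be traced for one unbatched input. The sketch uses hypothetical names and plain vectors; treating any activation string other than "relu" as tanh is a simplification made here, not documented behavior.

```scala
// Sketch only: RnnCellSketch and rnnStep are hypothetical names.
object RnnCellSketch:
  private def matVec(w: Vector[Vector[Double]], x: Vector[Double]): Vector[Double] =
    w.map(row => row.zip(x).map { case (a, b) => a * b }.sum)

  def rnnStep(wIh: Vector[Vector[Double]], bIh: Vector[Double],
              wHh: Vector[Vector[Double]], bHh: Vector[Double],
              x: Vector[Double], h: Vector[Double],
              activation: String = "tanh"): Vector[Double] =
    val act: Double => Double = activation match
      case "relu" => (a: Double) => math.max(0.0, a)
      case _      => (a: Double) => math.tanh(a)   // sketch default
    val fromInput  = matVec(wIh, x)
    val fromHidden = matVec(wHh, h)
    // Pre-activation: W_ih * x + b_ih + W_hh * h + b_hh (two separate biases).
    val pre = h.indices.toVector.map(i => fromInput(i) + bIh(i) + fromHidden(i) + bHh(i))
    pre.map(act)
```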

Value parameters

activation: activation function to use: "tanh" (default) or "relu"
hiddenSize: number of hidden units
inputSize: number of input features

Attributes

See also: https://pytorch.org/docs/stable/generated/torch.nn.RNNCell.html
Companion: object
Supertypes: class SeqModule

class BaseModule

class Object

trait Matchable

class Any

The RNNCell object provides a factory method for creating instances of the RNNCell class.

Attributes

Companion: class
Supertypes: class Object

trait Matchable

class Any
Self type: RNNCell.type

The RNNCellFused Function implements a single RNN cell as one fused autograd op. It fuses the input/hidden projections and activation into a single node for improved performance and reduced autograd graph size.

Equation:
h_t = φ(W_ih * x + b_ih + W_hh * hPrev + b_hh), where φ ∈ {tanh, relu}

Shapes:
input: (B, I, 1)
hidden: (B, H, 1)
W_ih: (1, H, I)
W_hh: (1, H, H)
b_ih: (1, H, 1)
b_hh: (1, H, 1)

The function caches only what is needed for the backward pass:

input and hidden states
pre-activation value
output after activation

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Fused RNN over a whole input sequence (vanilla RNN).

Unrolls the sequence in a single Function.
Returns the last hidden state as the output Variabl.
On backward(), performs full BPTT and accumulates parameter grads.

Shapes:
input(t): (B, I, 1)
hidden: (B, H, 1) (initial hidden state, h0)
W_ih: (1, H, I)
W_hh: (1, H, H)
b_ih: (1, H, 1)
b_hh: (1, H, 1)

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The RNNTestCore object defines a suite of @main entrypoints that exercise the autograd system using recurrent neural network components. These tests verify:

forward computation consistency for RNNCell and GRUCell
correct propagation of hidden states through RNNBase
correctness of gradient backpropagation through time
multilayer RNN/GRU behavior and parameter interaction
construction and export of autograd computation graphs for debugging

All tests use synthetic inputs and manually assigned weights/biases, so behavior is deterministic and can be validated against PyTorch; this also enables reliable gradient checking via finite differences using GradCheck.gradCheck.

Attributes

Note: This file focuses exclusively on core autograd correctness and does not contain any real-data forecasting experiments.
Supertypes: class Object

trait Matchable

class Any
Self type: RNNTestCore.type

The RNNTestForecasting object provides a suite of time-series utilities and forecasting experiments using Autograd-based recurrent neural networks. It includes:

lagged-window matrix builders (buildMatrix4TS, buildMatrix4TSX)
batch construction utilities for sequence models (makeBatches)
demonstration tests for RNN and GRU models on synthetic sequences, COVID-19 new-deaths data, and ILI (Influenza-Like Illness) data
chronological train/test splits
rolling / walk-forward validation

These tests verify correctness of data pipelines, shape handling, training loops, scaling transformations, and forecasting performance.

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: RNNTestForecasting.type

Applies the ReLU activation function.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the reciprocal of a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

PyTorch-style ReduceLROnPlateau scheduler. Monitors a metric each epoch and reduces the learning rate when progress plateaus. Supports both "min" (e.g., loss) and "max" (e.g., accuracy) modes, with relative or absolute thresholds for determining improvement. The LR is reduced when the number of non-improving epochs exceeds patience, after which a cooldown period prevents further reductions. Each reduction follows: newLR = max(oldLR * factor, minLR), skipped when the change is too small (≤ eps). Non-finite metric values are ignored. Call step(metric) after each optimizer update. getLastLR returns the most recent learning rate.
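The control flow described above can be sketched for the "min" mode with a relative threshold. This is a simplified illustration under stated assumptions: `PlateauSketch.Plateau` is a hypothetical class that holds the learning rate itself instead of wrapping an optimizer, the default values are placeholders, and the "max" and "abs" variants are omitted.

```scala
// Sketch only: hypothetical class covering "min" mode with a relative threshold.
object PlateauSketch:
  final class Plateau(var lr: Double, factor: Double = 0.1, patience: Int = 10,
                      threshold: Double = 1e-4, cooldown: Int = 0,
                      minLR: Double = 0.0, eps: Double = 1e-8):
    private var best = Double.PositiveInfinity
    private var badEpochs = 0
    private var coolLeft = 0

    def step(metric: Double): Unit =
      // Non-finite metric values are ignored entirely.
      if java.lang.Double.isFinite(metric) then
        if metric < best * (1.0 - threshold) then
          best = metric
          badEpochs = 0
        else
          badEpochs += 1
        // During cooldown the bad-epoch counter is held at zero.
        if coolLeft > 0 then
          coolLeft -= 1
          badEpochs = 0
        // Reduce only once the count strictly exceeds patience.
        if badEpochs > patience then
          val next = math.max(lr * factor, minLR)
          if lr - next > eps then lr = next
          coolLeft = cooldown
          badEpochs = 0
```

With `patience = 1`, the first non-improving epoch is tolerated and the second one triggers the reduction.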

Value parameters

cooldown: epochs to wait after a reduction during which bad-epoch counter stays at 0
eps: minimal effective LR change required to apply a reduction
factor: multiplicative decay factor in (0,1), i.e., newLR = oldLR * factor
minLR: lower bound on the learning rate
mode: "min" or "max" (target direction for improvement)
optim: the optimizer whose learning rate will be scheduled
patience: number of non-improving epochs tolerated; a reduction is triggered only once the count strictly exceeds patience
threshold: significance threshold (relative or absolute depending on thresholdMode)
thresholdMode: "rel" for relative margin, "abs" for absolute margin
verbose: if true, prints LR reduction messages

Attributes

Supertypes: trait LRScheduler

class Object

trait Matchable

class Any

Reshape operation for a variable. This class represents a differentiable operation that reshapes a tensor variable to a new shape during the forward pass and reshapes the gradient back to the original shape during the backward pass.

Value parameters

newShape: the target shape for the variable
v: the input variable to be reshaped

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the round of a variable (element-wise).

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Implements the Stochastic Gradient Descent (SGD) optimization algorithm.

Value parameters

lr: the learning rate used for updating the parameters.
momentum: momentum factor to accelerate convergence (default is 0.0).
parameters: an indexed sequence of model parameters to be optimized.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

class Optimizer

class Object

trait Matchable

class Any

Computes the Sum of Squared Errors (SSE) loss.

Value parameters

pred: the prediction variable.
target: the target variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Implements the Scaled Dot-Product Attention mechanism. This class is a sequence module that computes the attention scores and applies them to the value tensor (v) based on the query (q) and key (k) tensors. It is a fundamental building block for transformer models.
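For reference, the formula from the cited paper, softmax(Q Kᵀ / √d_k) V, can be sketched on small row-major matrices. The names are hypothetical, and the sketch omits masking, dropout, and batching, which the class may or may not support.

```scala
// Sketch only: AttentionSketch and attention are hypothetical names.
object AttentionSketch:
  // Numerically stable softmax over one row of scores.
  private def softmax(row: Vector[Double]): Vector[Double] =
    val mx = row.max
    val ex = row.map(s => math.exp(s - mx))
    val total = ex.sum
    ex.map(_ / total)

  // q: (nQueries x dK), k: (nKeys x dK), v: (nKeys x dV); result: (nQueries x dV).
  def attention(q: Vector[Vector[Double]], k: Vector[Vector[Double]],
                v: Vector[Vector[Double]]): Vector[Vector[Double]] =
    val scale = 1.0 / math.sqrt(k.head.size.toDouble)
    q.map { qi =>
      // Scaled dot product of this query with every key.
      val scores = k.map(kj => qi.zip(kj).map { case (a, b) => a * b }.sum * scale)
      val w = softmax(scores)
      // Attention-weighted sum of the value rows, computed column by column.
      v.transpose.map(col => col.zip(w).map { case (c, wi) => c * wi }.sum)
    }
```

A query that scores all keys equally receives the plain average of the value rows.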

Attributes

See also: https://arxiv.org/abs/1706.03762 "Attention Is All You Need" by Vaswani et al., 2017.

https://docs.pytorch.org/docs/stable/generated/torch.nn.functional.scaled_dot_product_attention.html
Supertypes: class SeqModule

class BaseModule

class Object

trait Matchable

class Any

Module for layers that take multiple inputs (e.g., RNN cells, attention blocks). Defines the abstract forward function for sequence or multiple inputs.

Value parameters

localParameters: the parameters (Variables) directly belonging to this module

Attributes

Supertypes: class BaseModule

class Object

trait Matchable

class Any
Known subtypes: class MultiHeadAttention

class ScaledDotProductAttention

Applies the sigmoid activation function.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Applies the sign function element-wise. Derivative is zero almost everywhere (undefined at zero).

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Represents a slicing operation on a tensor variable. This class performs a differentiable slicing operation during the forward pass and propagates the gradient to the sliced region during the backward pass.

Value parameters

r0: the range for the first dimension
r1: the range for the second dimension
r2: the range for the third dimension
v: the input variable to be sliced

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Applies the softmax activation function.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the square root of a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Enumeration representing the status of a test:

Passed
Failed

Attributes

Supertypes: trait Enum

trait Serializable

trait Product

trait Equals

class Object

trait Matchable

class Any

Computes the standard deviation of all elements in a variable. Std(x) = sqrt(Var(x)); derivative ds/dx = (x - mean)/(N * std).
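The stated derivative can be checked on a flat vector. The sketch assumes the population variance (divisor N), which is what the N in the formula implies; `StdGradSketch`, `std`, and `stdGrad` are hypothetical names.

```scala
// Sketch only: hypothetical helpers using the population (divide-by-N) variance.
object StdGradSketch:
  def std(x: Vector[Double]): Double =
    val mean = x.sum / x.size
    math.sqrt(x.map(xi => (xi - mean) * (xi - mean)).sum / x.size)

  // ds/dx_i = (x_i - mean) / (N * std); undefined when std is zero.
  def stdGrad(x: Vector[Double]): Vector[Double] =
    val mean = x.sum / x.size
    val s = std(x)
    x.map(xi => (xi - mean) / (x.size * s))
```

Because the deviations from the mean sum to zero, the components of this gradient also sum to zero.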

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the standard deviation of a variable along a specified axis.

Value parameters

axis: the axis along which to compute std.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Step-based learning rate scheduler. Reduces the optimizer's learning rate by multiplying it by gamma every stepSize epochs. Matches the behavior of PyTorch's StepLR for the single-LR (non-param-group) setting.
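In closed form, the schedule is lr(epoch) = baseLR · gamma^⌊epoch / stepSize⌋. The helper below is a hypothetical stand-alone function for illustration; the class itself updates the optimizer's learning rate in place.

```scala
// Sketch only: StepLRSketch and lrAt are hypothetical names.
object StepLRSketch:
  def lrAt(baseLR: Double, gamma: Double, stepSize: Int, epoch: Int): Double =
    // Integer division counts how many full step intervals have elapsed.
    baseLR * math.pow(gamma, (epoch / stepSize).toDouble)
```

With baseLR = 0.1, gamma = 0.5, and stepSize = 10, the rate stays at 0.1 for epochs 0 to 9, drops to 0.05 at epoch 10, and to 0.025 at epoch 20.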

Value parameters

gamma: the multiplicative decay factor applied every step
optim: the optimizer whose learning rate will be scheduled
stepSize: the interval (in epochs) between LR reductions

Attributes

Supertypes: trait LRScheduler

class Object

trait Matchable

class Any

Computes element-wise subtraction of two variables.

Value parameters

v1: the minuend.
v2: the subtrahend.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Subtracts a constant value from a variable.

Value parameters

d: the constant to subtract.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the sum of all elements in a variable.

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Applies the tanh activation function.
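For reference, the backward pass of tanh uses the identity d/dx tanh(x) = 1 - tanh(x)^2, so the forward output can be reused. A minimal sketch on plain arrays follows; `tanhForward` and `tanhBackward` are hypothetical names, not ScalaTion methods.

```scala
// Forward pass of tanh on a plain array (hypothetical helper, not ScalaTion's API).
def tanhForward(x: Array[Double]): Array[Double] = x.map(math.tanh)

// Backward pass: dL/dx = dL/dy * (1 - y^2), where y = tanh(x) is the saved forward output.
def tanhBackward(y: Array[Double], gradOut: Array[Double]): Array[Double] =
  y.zip(gradOut).map((yi, gi) => gi * (1.0 - yi * yi))
```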

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The TensorInitializers utility object provides tensor initializations commonly used in neural networks. It offers methods to create tensors filled with zeros, ones, or random values, as well as standard initialization schemes such as He and Xavier initialization. All returned tensors have batch-first shape: (batch, rows, cols).
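As an illustration of the schemes named above, here is a minimal sketch using nested arrays as a stand-in for TensorD. All names are hypothetical. It assumes the normal variant of He initialization, the uniform variant of Xavier initialization, and fanIn = rows, fanOut = cols; the variants and fan convention ScalaTion actually uses may differ.

```scala
import scala.util.Random

// A (batch, rows, cols) tensor as nested arrays; a stand-in for ScalaTion's TensorD.
type Tensor3 = Array[Array[Array[Double]]]

// Build a batch-first tensor, evaluating `gen` once per element.
def fill3(batch: Int, rows: Int, cols: Int)(gen: => Double): Tensor3 =
  Array.fill(batch, rows, cols)(gen)

def zeros(batch: Int, rows: Int, cols: Int): Tensor3 = fill3(batch, rows, cols)(0.0)
def ones(batch: Int, rows: Int, cols: Int): Tensor3 = fill3(batch, rows, cols)(1.0)

// He (Kaiming) normal: variance 2 / fanIn. Assumes fanIn = rows.
def heNormal(batch: Int, rows: Int, cols: Int, rng: Random): Tensor3 =
  val sd = math.sqrt(2.0 / rows)
  fill3(batch, rows, cols)(rng.nextGaussian() * sd)

// Xavier (Glorot) uniform: U(-a, a) with a = sqrt(6 / (fanIn + fanOut)).
// Assumes fanIn = rows and fanOut = cols.
def xavierUniform(batch: Int, rows: Int, cols: Int, rng: Random): Tensor3 =
  val a = math.sqrt(6.0 / (rows + cols))
  fill3(batch, rows, cols)((rng.nextDouble() * 2.0 - 1.0) * a)
```

Passing the random generator explicitly keeps the sketch reproducible under a fixed seed.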

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: TensorInitializers.type

Companion object for creating TestReport instances.

Attributes

Companion: class
Supertypes: class Object

trait Matchable

class Any
Self type: TestReport.type

A test report utility for recording and summarizing test results. Stores a collection of TestResult objects and provides support for timing tests, capturing failures, and printing formatted summary reports.
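The general pattern (time a test body, record pass or fail, print a summary) might look like the sketch below. `Status`, `Result`, and `Report` are hypothetical stand-ins whose names, fields, and methods are assumptions; they are not the actual TestReport or TestResult definitions.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical stand-ins; names and field order are assumptions, not ScalaTion's definitions.
enum Status:
  case Passed, Failed

case class Result(name: String, status: Status, ms: Double, note: String = "")

class Report:
  private val results = ArrayBuffer.empty[Result]

  // Time `body`; record Passed if it returns normally, Failed (with the message) if it throws.
  def run(name: String)(body: => Unit): Unit =
    val t0 = System.nanoTime()
    val (status, note) =
      try { body; (Status.Passed, "") }
      catch case NonFatal(e) => (Status.Failed, Option(e.getMessage).getOrElse(""))
    val ms = (System.nanoTime() - t0) / 1e6
    results += Result(name, status, ms, note)

  def failures: Seq[Result] = results.filter(_.status == Status.Failed).toSeq

  // One line per test, followed by a pass count.
  def summary: String =
    val lines = results.map(r => f"${r.status}%-6s ${r.name} (${r.ms}%.2f ms) ${r.note}")
    val passed = results.count(_.status == Status.Passed)
    (lines :+ s"$passed/${results.size} passed").mkString("\n")
```

A caller would wrap each check as `report.run("name") { ... }` and print `report.summary` at the end.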

Attributes

Companion: object
Supertypes: class Object

trait Matchable

class Any

Result container for a single test execution.

Value parameters

ms: the execution time in milliseconds
name: the name of the test
note: optional note or error message (default empty)
status: the status of the test (Passed or Failed)

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

class Object

trait Matchable

class Any

The TransformerEnc object implements the attention method based on the scaled dot product.
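Scaled dot-product attention is conventionally defined as softmax(Q K^T / sqrt(d_k)) V. The sketch below shows that computation for a single head without masking, using nested arrays rather than ScalaTion tensors; `Matrix`, `softmaxRows`, `matmul`, and `attention` are hypothetical names, and the signature of the actual TransformerEnc method may differ.

```scala
type Matrix = Array[Array[Double]]

// Row-wise softmax with max subtraction for numerical stability.
def softmaxRows(a: Matrix): Matrix =
  a.map { row =>
    val mx = row.max
    val ex = row.map(v => math.exp(v - mx))
    val s = ex.sum
    ex.map(_ / s)
  }

// Plain matrix product a (n x k) * b (k x m).
def matmul(a: Matrix, b: Matrix): Matrix =
  a.map(row => b.head.indices.map(j => row.indices.map(k => row(k) * b(k)(j)).sum).toArray)

// Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, single head, no mask.
def attention(q: Matrix, k: Matrix, v: Matrix): Matrix =
  val dk = q.head.length
  val scores = matmul(q, k.transpose).map(_.map(_ / math.sqrt(dk)))
  matmul(softmaxRows(scores), v)
```

Dividing by sqrt(d_k) keeps the score magnitudes from growing with the key dimension, which would otherwise push the softmax toward saturation.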

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: TransformerEnc.type

The TransformerTestCore object tests the Transformer class.

Attributes

Supertypes: class Object

trait Matchable

class Any
Self type: TransformerTestCore.type

Transposes (swaps) two axes of a tensor variable.
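To make the axis swap concrete, the sketch below implements it for a dense row-major tensor. `Dense` and `swapAxes` are hypothetical names standing in for TensorD and this function. Since swapping the same two axes twice restores the original layout, a backward pass can apply the same swap to the incoming gradient.

```scala
// A dense tensor stored row-major in a flat array (hypothetical stand-in for TensorD).
case class Dense(shape: Vector[Int], data: Array[Double]):
  // Row-major strides: the last axis is contiguous.
  val strides: Vector[Int] = shape.scanRight(1)(_ * _).tail

// Return a new tensor with axes i and j exchanged.
def swapAxes(t: Dense, i: Int, j: Int): Dense =
  require(t.shape.indices.contains(i) && t.shape.indices.contains(j), "axis out of range")
  val newShape = t.shape.updated(i, t.shape(j)).updated(j, t.shape(i))
  val out = Dense(newShape, new Array[Double](t.data.length))
  for flat <- t.data.indices do
    // Decode the source multi-index, swap positions i and j, re-encode for the output.
    val src = t.strides.indices.map(a => (flat / t.strides(a)) % t.shape(a)).toVector
    val dst = src.updated(i, src(j)).updated(j, src(i))
    out.data(dst.zip(out.strides).map((d, s) => d * s).sum) = t.data(flat)
  out
```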

Value parameters

i: first axis index.
j: second axis index.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

The Variabl case class represents a tensor with automatic differentiation capability. It tracks operations applied to it for backward gradient propagation. Variabls can be combined using arithmetic operations, activation functions, and loss functions. Backpropagation is triggered via the backward method.
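The mechanism described here (record operations during the forward pass, then propagate gradients in reverse) can be illustrated with a scalar sketch. `Node` is a hypothetical class written for this example; Variabl itself wraps tensors and uses gradFn and ops, which are not modeled here.

```scala
// A scalar reverse-mode autodiff node (hypothetical; ScalaTion's Variabl wraps tensors).
// Each parent is stored with the local derivative d(this)/d(parent).
final class Node(val data: Double, val parents: Seq[(Node, Double)] = Nil):
  var grad: Double = 0.0   // accumulated d(output)/d(this)

  def +(that: Node): Node = Node(data + that.data, Seq((this, 1.0), (that, 1.0)))
  def *(that: Node): Node = Node(data * that.data, Seq((this, that.data), (that, data)))
  def tanh: Node =
    val y = math.tanh(data)
    Node(y, Seq((this, 1.0 - y * y)))

  // Seed this node with gradient 1 and push gradients to parents in reverse topological order.
  def backward(): Unit =
    val order = scala.collection.mutable.ArrayBuffer.empty[Node]
    val seen = scala.collection.mutable.Set.empty[Node]
    def visit(n: Node): Unit =
      if seen.add(n) then
        n.parents.foreach((p, _) => visit(p))
        order += n
    visit(this)
    grad = 1.0
    for n <- order.reverseIterator; (p, local) <- n.parents do
      p.grad += local * n.grad
```

For z = x * y + x with x = 2 and y = 3, calling `z.backward()` leaves x.grad = y + 1 = 4 and y.grad = x = 2. Gradients accumulate across calls in this sketch, so they would need to be reset between backward passes.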

Value parameters

data: the tensor data for this variable.
gradFn: an optional function for backpropagation.
name: an optional name for this variable.
ops: the implicit autograd operations for tensor computations.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

class Object

trait Matchable

class Any

Computes the variance of all elements in a variable (population variance). Uses definition Var(x) = mean((x - mean(x))^2).

Value parameters

v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Computes the variance of a variable along a specified axis (population variance).
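A minimal sketch of an axis-wise population variance for a matrix is given below. `varianceAxis` is a hypothetical helper on nested arrays, and it assumes the NumPy-style convention that axis 0 reduces over rows (one result per column) and axis 1 reduces over columns (one result per row); ScalaTion's axis convention may differ.

```scala
// Population variance of a rows x cols matrix along one axis.
// Assumes the NumPy-style convention: axis 0 reduces over rows, axis 1 over columns.
def varianceAxis(a: Array[Array[Double]], axis: Int): Array[Double] =
  require(axis == 0 || axis == 1, "axis must be 0 or 1 for a matrix")
  // Each slice holds the elements that are reduced together.
  val slices = if axis == 0 then a.transpose else a
  slices.map { s =>
    val m = s.sum / s.length
    s.map(v => (v - m) * (v - m)).sum / s.length
  }
```

For the matrix with rows (1, 2) and (3, 6), axis 0 gives (1, 4) and axis 1 gives (0.25, 2.25).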

Value parameters

axis: the axis along which to compute variance.
v: the input variable.

Attributes

Supertypes: trait Serializable

trait Product

trait Equals

trait Function

class Object

trait Matchable

class Any

Provides an implicit conversion from a Module to a function that maps a Variabl to a Variabl. This allows using a Module directly as a function.
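In Scala 3, such a conversion is typically expressed as a given instance of scala.Conversion. The sketch below shows the pattern with hypothetical stand-ins (`Var` for Variabl, `Layer` for Module, plus `Doubler` and `compose` for demonstration); it is not the library's actual definition.

```scala
import scala.language.implicitConversions

// Hypothetical stand-ins for ScalaTion's Variabl and Module.
final case class Var(value: Double)

trait Layer:
  def forward(x: Var): Var

// With this given in scope, a Layer can be used wherever a Var => Var is expected.
given Conversion[Layer, Var => Var] with
  def apply(m: Layer): Var => Var = m.forward

object Doubler extends Layer:
  def forward(x: Var): Var = Var(2.0 * x.value)

// Passing layers to a function that expects plain functions triggers the conversion.
def compose(fs: (Var => Var)*): Var => Var = fs.reduce(_ andThen _)
val pipeline: Var => Var = compose(Doubler, Doubler)
```

Here `pipeline(Var(1.5))` applies `Doubler.forward` twice. The language import at the use site avoids the compiler's feature warning for implicit conversions.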

Attributes

Supertypes: class Conversion[Module, Variabl => Variabl]

trait Module => Variabl => Variabl

class Object

trait Matchable

class Any
Self type: given_Conversion_Module_Function.type
