
SimpleEncoder

scalation.modeling.forecasting.neuralforecasting.SimpleEncoder

object SimpleEncoder

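The SimpleEncoder object implements scaled dot-product attention, together with the supporting steps documented under Concrete methods: patch embedding, positional encoding, layer normalization, and patchifying a time series.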

Attributes

Supertypes: class Object, trait Matchable, class Any
Self type: SimpleEncoder.type

Members list

Value members

Concrete methods

Based on the Query (Q), Key (K), and Value (V) matrices, compute the attention.

att = softmax (QK^T/√d_k) V

Value parameters

d_k: the dimensionality of the Query, Key, and Value vectors (if the Value dimensionality differs, use d_v for it)
k: the Key: other locations to compare it with (for similarity)
q: the Query: the input of interest
v: the Value: the input value at the key locations
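
The formula above can be illustrated with a minimal sketch over plain `Array[Array[Double]]` matrices. It does not use ScalaTion's own matrix classes or reproduce this method's actual signature; the object name `AttentionSketch` and its helpers (`matMul`, `transpose`, `softmaxRows`) are hypothetical names introduced only for this example.

```scala
object AttentionSketch {
  type Matrix = Array[Array[Double]]

  // Matrix product a * b, where a is m-by-n and b is n-by-p.
  def matMul(a: Matrix, b: Matrix): Matrix =
    a.map(row => b.head.indices.map(j => row.indices.map(i => row(i) * b(i)(j)).sum).toArray)

  // Transpose of a (rows become columns).
  def transpose(a: Matrix): Matrix = a.head.indices.map(j => a.map(_(j))).toArray

  // Row-wise softmax, shifted by the row maximum for numerical stability.
  def softmaxRows(a: Matrix): Matrix = a.map { row =>
    val mx = row.max
    val ex = row.map(x => math.exp(x - mx))
    val s  = ex.sum
    ex.map(_ / s)
  }

  // att = softmax (Q K^T / sqrt(d_k)) V
  def attention(q: Matrix, k: Matrix, v: Matrix, d_k: Int): Matrix = {
    val scores = matMul(q, transpose(k)).map(_.map(_ / math.sqrt(d_k)))
    matMul(softmaxRows(scores), v)
  }
}
```

Each output row is a weighted average of the rows of `v`, with the weights given by how similar the corresponding query is to each key. A query that is equally similar to all keys therefore yields the plain mean of the Value rows.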


Use a matrix transformation containing learnable weights to embed each patch vector into a higher-dimensional space (providing enhanced vector similarity). The dimensionality of the embedding space is d_model. For this simple implementation, d_model = d_k because there is only one attention head.

Value parameters

wE: the embedding matrix of learnable weights (its shape determines the dimensionality of the embedding space)
xx: the matrix containing each patch as a row
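
As a rough illustration of the embedding step, the sketch below multiplies each patch (a row of `xx`) by a weight matrix. The shapes are an assumption made for this example: `xx` is taken to be nPatches-by-pl and `wE` to be pl-by-d_model. The object name `EmbeddingSketch` is hypothetical, and plain arrays stand in for ScalaTion's matrix type.

```scala
object EmbeddingSketch {
  type Matrix = Array[Array[Double]]

  // Embed each patch (a row of xx) as the product patch * wE.
  // Assumed shapes: xx is nPatches-by-pl, wE is pl-by-d_model.
  def embed(xx: Matrix, wE: Matrix): Matrix = {
    require(xx.forall(_.length == wE.length), "patch length must equal the row count of wE")
    xx.map(patch => wE.head.indices.map(j => patch.indices.map(i => patch(i) * wE(i)(j)).sum).toArray)
  }
}
```

Under these assumptions, a patch of length 2 embedded with a 2-by-3 weight matrix becomes a vector of length 3, so d_model would be 3.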


Encode all the positions in the time series as vectors of length d_model.

Value parameters

d_k: the dimensionality of the model (d_model = d_k here)
len: the sequence length
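
The description does not say which encoding scheme is used. The sketch below assumes the sinusoidal scheme common in Transformer models, where even columns hold sines and odd columns hold cosines of position-dependent angles; this is an assumption, not a statement about this method's implementation. The names `PositionSketch` and `positionalEncoding` are hypothetical.

```scala
object PositionSketch {
  // Build a len-by-d_k matrix whose row pos encodes position pos.
  // Assumed scheme: PE(pos, 2i) = sin(pos / 10000^(2i/d_k)), PE(pos, 2i+1) = cos(same angle).
  def positionalEncoding(len: Int, d_k: Int): Array[Array[Double]] =
    Array.tabulate(len, d_k) { (pos, j) =>
      val angle = pos / math.pow(10000.0, 2.0 * (j / 2) / d_k)   // j / 2 is integer division
      if (j % 2 == 0) math.sin(angle) else math.cos(angle)
    }
}
```

Because every entry lies in [-1, 1], such an encoding can be added to the embedded patches without dominating them.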


Perform layer normalization on matrix x. While batch normalization normalizes over a mini-batch, layer normalization normalizes over a single instance or row.

Value parameters

x: the matrix to normalize
β: the shifting learnable parameter (defaults to 0.0)
γ: the scaling learnable parameter (defaults to 1.0)

Attributes

An affine transformation is supported via the learnable parameters γ and β.

See also: www.geeksforgeeks.org/deep-learning/what-is-layer-normalization/
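
A minimal sketch of row-wise layer normalization follows, assuming scalar γ and β (written `gamma` and `beta` in the code) with the defaults listed above. The small constant `eps` is an assumption added to avoid division by zero, and `LayerNormSketch` is a hypothetical name; the actual method may differ in these details.

```scala
object LayerNormSketch {
  // Normalize each row to zero mean and (approximately) unit variance,
  // then apply the affine transformation gamma * xhat + beta.
  def layerNorm(x: Array[Array[Double]], gamma: Double = 1.0, beta: Double = 0.0,
                eps: Double = 1e-5): Array[Array[Double]] =
    x.map { row =>
      val mean     = row.sum / row.length
      val variance = row.map(v => (v - mean) * (v - mean)).sum / row.length
      val sd       = math.sqrt(variance + eps)        // eps guards against a zero variance
      row.map(v => gamma * (v - mean) / sd + beta)
    }
}
```

With the default parameters each normalized row has mean 0; a nonzero `beta` shifts that mean to `beta`, and `gamma` rescales the spread.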

Patchify the univariate time series y by breaking it into non-overlapping patches of length pl. This simple implementation assumes stride s = pl, whereas PatchTST uses pl = 16 and s = 8 as defaults, so its patches overlap.

Value parameters

pl: the patch length
y: the given univariate time series
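
The patching step can be sketched with standard collection methods. The version below drops trailing values that do not fill a whole patch, which is an assumption about edge handling rather than documented behavior. The second function, `patchifyStrided`, is a hypothetical generalization that shows how a stride s smaller than pl produces overlapping patches; it is not part of the documented API.

```scala
object PatchSketch {
  // Non-overlapping patches of length pl (stride s = pl).
  // Assumption: trailing values that do not fill a whole patch are dropped.
  def patchify(y: Array[Double], pl: Int): Array[Array[Double]] = {
    require(pl > 0, "patch length must be positive")
    y.grouped(pl).filter(_.length == pl).toArray
  }

  // Hypothetical generalization: patches of length pl taken every s values.
  def patchifyStrided(y: Array[Double], pl: Int, s: Int): Array[Array[Double]] = {
    require(pl > 0 && s > 0, "patch length and stride must be positive")
    y.sliding(pl, s).filter(_.length == pl).toArray
  }
}
```

Each resulting patch becomes one row of the matrix that is later passed to the embedding step.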


Concrete fields
