
TransformerEnc

scalation.modeling.autograd.TransformerEnc

object TransformerEnc

The TransformerEnc object implements scaled dot-product attention, together with the supporting steps documented below: patching, embedding, positional encoding, and layer normalization.

Attributes

Supertypes: class Object, trait Matchable, class Any
Self type: TransformerEnc.type

Members list

Value members

Concrete methods

Based on the Query (Q), Key (K), and Value (V) matrices, compute the attention.

att = softmax (QKᵀ/√d_k) V

Value parameters

d_k: the dimensionality of the Query, Key, and Value vectors (if the Value dimensionality differs, use d_v)
k: the Key: the other locations to compare the query with (for similarity)
q: the Query: the input of interest
v: the Value: the input value at the key locations
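The formula above can be sketched in plain Scala as follows. This is a minimal illustration on nested arrays, not ScalaTion's own implementation: the object name AttentionSketch, the Matrix alias, and the helpers matmul and softmaxRow are hypothetical, and the softmax is assumed to be applied row by row.

```scala
object AttentionSketch {
  type Matrix = Array[Array[Double]]   // hypothetical alias, rows of doubles

  // Plain matrix product a * b (a is m-by-n, b is n-by-p).
  def matmul(a: Matrix, b: Matrix): Matrix =
    a.map(row => b.head.indices.map(j => row.indices.map(i => row(i) * b(i)(j)).sum).toArray)

  // Softmax of one row, shifted by the row maximum for numerical stability.
  def softmaxRow(r: Array[Double]): Array[Double] = {
    val mx = r.max
    val e  = r.map(z => math.exp(z - mx))
    val s  = e.sum
    e.map(_ / s)
  }

  // att = softmax (Q Kᵀ / √d_k) V, with softmax taken over each row of scores.
  def attention(q: Matrix, k: Matrix, v: Matrix, dK: Int): Matrix = {
    val scores = matmul(q, k.transpose).map(_.map(_ / math.sqrt(dK)))
    matmul(scores.map(softmaxRow), v)
  }
}
```

When all keys are identical, every score in a row is equal, so the attention weights are uniform and each output row is the average of the rows of v.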


Use a matrix transformation containing learnable weights to embed each patch vector into a higher-dimensional space (providing enhanced vector similarity). The dimensionality of the embedding space is d_model. For this simple implementation, d_model = d_k, as there is only one attention head.

Value parameters

wE: the dimensionality of the embedding space
xx: the matrix containing each patch as a row
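As a rough sketch of the embedding step, the transformation can be written as a matrix product of the patch matrix with a weight matrix. The code below assumes, for illustration only, that wE is a weight matrix with one row per patch element and d_model columns; the object name EmbedSketch and the Matrix alias are hypothetical and this is not ScalaTion's own code.

```scala
object EmbedSketch {
  type Matrix = Array[Array[Double]]   // hypothetical alias, rows of doubles

  // Embed each patch (a row of xx, length pl) into d_model dimensions: xx * wE.
  // Assumption: wE has pl rows and d_model columns.
  def embed(xx: Matrix, wE: Matrix): Matrix = {
    require(xx.forall(_.length == wE.length), "patch length must equal the row count of wE")
    xx.map(patch => wE.head.indices.map(j => patch.indices.map(i => patch(i) * wE(i)(j)).sum).toArray)
  }
}
```

With patches of length 2 and a 2-by-3 weight matrix, each patch becomes a vector of length 3, so the output has as many rows as xx and d_model columns.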


Encode all the positions in the time series as vectors of length d_model.

Value parameters

d_k: the dimensionality of the model (d_model = d_k here)
len: the sequence length
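The page does not say which encoding scheme is used. One common choice is the sinusoidal encoding from the original Transformer, sketched below under that assumption: even columns hold sines and odd columns hold cosines of position-dependent angles. The object name PosEncodingSketch and the method name positionalEncoding are hypothetical.

```scala
object PosEncodingSketch {
  // Sinusoidal positional encoding (an assumed scheme, not confirmed by this page):
  //   PE(pos, 2i)   = sin(pos / 10000^(2i / d_model))
  //   PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
  // Returns a len-by-dK matrix, one row per position (d_model = dK here).
  def positionalEncoding(len: Int, dK: Int): Array[Array[Double]] =
    Array.tabulate(len, dK) { (pos, j) =>
      val angle = pos / math.pow(10000.0, (2 * (j / 2)).toDouble / dK)
      if (j % 2 == 0) math.sin(angle) else math.cos(angle)
    }
}
```

Because the result has the same shape as the embedded patch matrix (len rows, d_model columns), it can be added to the embeddings element by element.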


Perform layer normalization on matrix x. The more general affine transformation is not supported in this simple implementation.

Value parameters

x: the matrix to normalize
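A minimal sketch of layer normalization without the affine (scale and shift) step is shown below. It assumes that each row of x is normalized independently to zero mean and unit variance, using the population variance and a small stabilizing constant eps; the object name LayerNormSketch and the eps parameter are hypothetical and not taken from ScalaTion.

```scala
object LayerNormSketch {
  // Normalize each row of x to zero mean and (approximately) unit variance.
  // Assumptions: per-row statistics, population variance, eps added under the square root.
  def layerNorm(x: Array[Array[Double]], eps: Double = 1e-5): Array[Array[Double]] =
    x.map { row =>
      val mean     = row.sum / row.length
      val variance = row.map(z => (z - mean) * (z - mean)).sum / row.length
      row.map(z => (z - mean) / math.sqrt(variance + eps))
    }
}
```

The more general form would multiply each normalized value by a learnable gain and add a learnable bias, which is the affine transformation the description says is omitted.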


Patchify the univariate time series y by breaking it into non-overlapping patches of length pl. This simple implementation assumes stride s = pl, but PatchTST uses pl = 16 and s = 8 as defaults.

Value parameters

pl: the patch length
y: the given univariate time series
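Non-overlapping patching with stride s = pl can be sketched with the standard library's grouped method. The sketch below assumes that a trailing remainder shorter than pl is dropped, which this page does not specify; the object name PatchifySketch is hypothetical.

```scala
object PatchifySketch {
  // Break y into consecutive, non-overlapping patches of length pl (stride = pl).
  // Assumption: an incomplete final patch is discarded.
  def patchify(y: Array[Double], pl: Int): Array[Array[Double]] = {
    require(pl > 0, "patch length must be positive")
    y.grouped(pl).filter(_.length == pl).toArray
  }
}
```

For a series of seven values and pl = 3, this yields two patches covering the first six values, each stored as one row of the resulting matrix.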


Concrete fields
