TransformerEnc

scalation.modeling.autograd.TransformerEnc

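The TransformerEnc object implements scaled dot-product attention, together with the supporting methods of a simple Transformer encoder for time series: patching, embedding, positional encoding, and layer normalization.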

Attributes

Supertypes
class Object
trait Matchable
class Any
Members list

Value members

Concrete methods

def attention(q: MatrixD, k: MatrixD, v: MatrixD, d_k: Int): MatrixD

Based on the Query (Q), Key (K), and Value (V) matrices, compute the attention.

att = softmax (Q Kᵀ / √d_k) V

Value parameters

d_k

the dimensionality of the Query, Key, and Value vectors (if the Value dimensionality differs, use d_v for it)

k

the Key: the other locations to compare the query with (for similarity)

q

the Query: the input of interest

v

the Value: the input value at the key locations
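The formula above can be sketched in plain Scala as follows. This is a minimal illustration, not the ScalaTion implementation: Array[Array[Double]] stands in for MatrixD, and the names attentionSketch, matMulSketch, and softmaxRows are hypothetical helpers introduced only for this example. The softmax is assumed to be applied row by row, so that each query gets its own weight distribution over the keys.

```scala
// Sketch only: plain arrays stand in for MatrixD (an assumption).

// Matrix product a * b for row-major arrays.
def matMulSketch(a: Array[Array[Double]], b: Array[Array[Double]]): Array[Array[Double]] = {
  a.map { row =>
    Array.tabulate(b(0).length) { j =>
      row.indices.map(i => row(i) * b(i)(j)).sum
    }
  }
}

// Row-wise softmax; the row maximum is subtracted for numerical stability.
def softmaxRows(m: Array[Array[Double]]): Array[Array[Double]] = {
  m.map { row =>
    val mx = row.max
    val ex = row.map(z => math.exp(z - mx))
    val s  = ex.sum
    ex.map(_ / s)
  }
}

// att = softmax (Q Kᵀ / √d_k) V
def attentionSketch(q: Array[Array[Double]], k: Array[Array[Double]],
                    v: Array[Array[Double]], d_k: Int): Array[Array[Double]] = {
  val scale  = math.sqrt(d_k.toDouble)
  val scores = matMulSketch(q, k.transpose).map(_.map(_ / scale))
  matMulSketch(softmaxRows(scores), v)
}
```

When every query is the zero vector, all scores are equal, so each output row is simply the average of the rows of v. This is a convenient sanity check for any implementation.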

def embed(xx: MatrixD, wE: MatrixD): MatrixD

Use a matrix transformation containing learnable weights to embed each patch vector into a higher-dimensional space (providing enhanced vector similarity). The dimensionality of the embedding space is d_model. In this simple implementation d_model = d_k, as there is only one attention head.

Value parameters

wE

the learnable embedding weight matrix, which maps each patch vector into the embedding space of dimensionality d_model

xx

the matrix containing each patch as a row
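As a rough sketch of the embedding step, the example below multiplies each patch (a row of xx) by a weight matrix. It assumes that wE has one row per patch element and one column per embedding dimension, so that the result is xx * wE; the source does not state the orientation, and embedSketch is a hypothetical name. Plain arrays again stand in for MatrixD.

```scala
// Sketch only: embeds each patch (row of xx) via a linear map.
// Assumption: wE has shape (patch length) x (d_model), so the embedding is xx * wE.
def embedSketch(xx: Array[Array[Double]], wE: Array[Array[Double]]): Array[Array[Double]] = {
  require(xx.forall(_.length == wE.length), "patch length must match the number of rows in wE")
  xx.map { patch =>
    Array.tabulate(wE(0).length) { j =>
      patch.indices.map(i => patch(i) * wE(i)(j)).sum
    }
  }
}
```

With a patch of length 2 and a 2-by-3 weight matrix, each patch becomes a vector of length 3, which illustrates the move into a higher-dimensional space.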

def encodePositions(len: Int, d_k: Int): MatrixD

Encode all the positions in the time series as vectors of length d_model.

Value parameters

d_k

the dimensionality of the model (d_model = d_k here)

len

the sequence length
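The source does not say which positional encoding is used. One common choice, shown here purely as an assumption, is the sinusoidal encoding from the original Transformer architecture, where even columns hold sines and odd columns hold cosines of position-dependent angles. The name encodePositionsSketch is hypothetical, and plain arrays stand in for MatrixD.

```scala
// Sketch only: sinusoidal positional encoding (an assumed scheme, not confirmed by the source).
// Returns a len x d_k array; row pos is the encoding of position pos.
def encodePositionsSketch(len: Int, d_k: Int): Array[Array[Double]] = {
  Array.tabulate(len, d_k) { (pos, j) =>
    // Columns 2i and 2i+1 share the same frequency 1 / 10000^(2i / d_k).
    val exponent = (2 * (j / 2)).toDouble / d_k
    val freq     = math.pow(10000.0, -exponent)
    if (j % 2 == 0) math.sin(pos * freq) else math.cos(pos * freq)
  }
}
```

The resulting rows would typically be added to the embedded patches so that the attention step can distinguish positions.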

Perform layer normalization on matrix x. The more general affine transformation is not supported in this simple implementation.

Value parameters

x

the matrix to normalize
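A minimal sketch of layer normalization without the affine transformation is given below. It assumes that each row of x is normalized independently to zero mean and unit variance, and that a small constant eps is added to the variance to avoid division by zero. The default value 1e-5 and the name layerNormSketch are assumptions; the source only states that an eps field of type Double exists.

```scala
// Sketch only: per-row layer normalization without learnable scale or shift.
// Assumption: eps = 1e-5 is an illustrative default, not the library's value.
def layerNormSketch(x: Array[Array[Double]], eps: Double = 1e-5): Array[Array[Double]] = {
  x.map { row =>
    val mean     = row.sum / row.length
    val variance = row.map(z => (z - mean) * (z - mean)).sum / row.length
    val sd       = math.sqrt(variance + eps)
    row.map(z => (z - mean) / sd)
  }
}
```

Supporting the general affine form would mean multiplying each normalized row by a learnable scale vector and adding a learnable shift vector.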

def patchify(y: VectorD, pl: Int): MatrixD

Patchify the univariate time series y by breaking it into non-overlapping patches of length pl. This simple implementation assumes stride s = pl, but PatchTST uses pl = 16 and s = 8 as defaults.

Value parameters

pl

the patch length

y

the given univariate time series
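The patching step can be illustrated with the sketch below. The first function follows the non-overlapping case described above (stride equal to the patch length); the second shows what a strided, overlapping variant in the style of PatchTST might look like. Both names are hypothetical, Array[Double] stands in for VectorD, and dropping an incomplete trailing patch is an assumption, since the source does not say how leftover values are handled.

```scala
// Sketch only: non-overlapping patches of length pl (stride s = pl).
// Assumption: trailing values that do not fill a whole patch are dropped.
def patchifySketch(y: Array[Double], pl: Int): Array[Array[Double]] = {
  require(pl > 0, "patch length must be positive")
  Array.tabulate(y.length / pl)(p => y.slice(p * pl, (p + 1) * pl))
}

// Sketch only: overlapping patches of length pl taken every s values.
def patchifyStridedSketch(y: Array[Double], pl: Int, s: Int): Array[Array[Double]] = {
  require(pl > 0 && s > 0, "patch length and stride must be positive")
  (0 to y.length - pl by s).map(start => y.slice(start, start + pl)).toArray
}
```

For a series of 8 values with pl = 4, the non-overlapping version yields 2 patches, while a stride of 2 yields 3 overlapping patches.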

Concrete fields

val eps: Double