
TrfEncoderLayer

scalation.modeling.forecasting.neuralforecasting.TrfEncoderLayer

class TrfEncoderLayer(x: MatrixD, heads: Int = ..., f: AFF = ..., initW: Array[MatrixD] = ..., p_drop: Double = ..., norm_eps: Double = ..., norm_first: Boolean = ...) extends Attention

The TrfEncoderLayer class consists of two sub-layers: Multi-Head Self-Attention and a Feed-Forward Neural Network (FFNN).

Value parameters

f: the activation function family (used by alinear1)
heads: the number of attention heads (e.g., 1 to 8)
norm_eps: a small value used in normalization to avoid division by zero
norm_first: whether layer normalization should be done first (see apply method)
p_drop: the probability of setting an element to zero in a dropout layer (e.g., 0.0 to 0.5)
x: the input data matrix after embedding (number of rows/instances by embedding dimension)
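
To illustrate where norm_eps enters, the following is a minimal sketch of layer normalization over one row, written with plain arrays. The object name LayerNormSketch, the function layerNorm and the default of 1e-5 are assumptions made for illustration; they are not ScalaTion's API, and learnable scale and shift terms are omitted.

```scala
// Hypothetical sketch; not ScalaTion's implementation.
object LayerNormSketch {

  /** Normalize one row to zero mean and approximately unit variance. */
  def layerNorm(row: Array[Double], eps: Double = 1e-5): Array[Double] = {
    val mean     = row.sum / row.length
    val variance = row.map(v => (v - mean) * (v - mean)).sum / row.length
    // eps keeps the denominator strictly positive, even for a constant row
    val denom    = math.sqrt(variance + eps)
    row.map(v => (v - mean) / denom)
  }
}
```

Without eps, a row whose elements are all equal would have zero variance and the division would be undefined.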

Attributes

See also: pytorch.org/docs/stable/generated/torch.nn.TransformerEncoderLayer.html#torch.nn.TransformerEncoderLayer
Supertypes: trait Attention

class Object

trait Matchable

class Any

Members list

Value members

Concrete methods

Compute the Feed-forward Neural Network result.

Value parameters

x: the input matrix

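
A feed-forward sub-layer of this kind is commonly two linear maps with an activation in between. The sketch below assumes ReLU as the activation and uses hypothetical names (FeedForwardSketch, w1, b1, w2, b2) with plain arrays; the activation actually used depends on the parameter f, and dropout is left out.

```scala
// Hypothetical sketch; not ScalaTion's implementation.
object FeedForwardSketch {
  type Mat = Array[Array[Double]]

  /** Plain matrix product a * b. */
  def matmul(a: Mat, b: Mat): Mat =
    a.map(row => b.head.indices.map(j => row.indices.map(k => row(k) * b(k)(j)).sum).toArray)

  /** Add the bias vector b to every row of a. */
  def addBias(a: Mat, b: Array[Double]): Mat =
    a.map(_.zip(b).map { case (u, v) => u + v })

  /** Element-wise ReLU (assumed activation). */
  def relu(a: Mat): Mat = a.map(_.map(v => math.max(0.0, v)))

  /** Compute relu(x * w1 + b1) * w2 + b2. */
  def feedForward(x: Mat, w1: Mat, b1: Array[Double], w2: Mat, b2: Array[Double]): Mat =
    addBias(matmul(relu(addBias(matmul(x, w1), b1)), w2), b2)
}
```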

Forward pass: Compute this encoder layer's result z by using Multi-Head Self-Attention followed by a Feed-Forward Neural Network.

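
The effect of norm_first on the forward pass can be sketched as follows. The ordering and the residual connections follow the PyTorch TransformerEncoderLayer referenced under "See also"; that ScalaTion uses exactly this arrangement is an assumption. The sub-layers are passed in as functions, and all names (EncoderOrderSketch, encoderLayer, attn, ffn, norm) are hypothetical.

```scala
// Hypothetical sketch; not ScalaTion's implementation.
object EncoderOrderSketch {
  type Mat = Array[Array[Double]]

  /** Element-wise sum of two equally sized matrices (residual connection). */
  def add(a: Mat, b: Mat): Mat =
    a.zip(b).map { case (r, s) => r.zip(s).map { case (u, v) => u + v } }

  /** Apply attention then feed-forward, normalizing before or after each sub-layer. */
  def encoderLayer(x: Mat, attn: Mat => Mat, ffn: Mat => Mat,
                   norm: Mat => Mat, normFirst: Boolean): Mat =
    if (normFirst) {
      val y = add(x, attn(norm(x)))        // pre-norm: normalize the sub-layer input
      add(y, ffn(norm(y)))
    } else {
      val y = norm(add(x, attn(x)))        // post-norm: normalize the residual sum
      norm(add(y, ffn(y)))
    }
}
```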

Compute the Multi-Head Self-Attention result.

Value parameters

x: the input matrix


Inherited methods

Compute a Self-Attention Weight Matrix from the given query (Q), key (K) and value (V).

Value parameters

k: the key matrix K
q: the query matrix Q (q_t over all time)
v: the value matrix V

Attributes

Inherited from: Attention
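
As a rough illustration of this computation, the sketch below implements standard scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, on plain arrays. The scaling by sqrt(d_k) and all names (AttentionSketch, attention, matmul, softmax) are assumptions for illustration and are not taken from ScalaTion.

```scala
// Hypothetical sketch; not ScalaTion's implementation.
object AttentionSketch {
  type Mat = Array[Array[Double]]

  /** Plain matrix product a * b. */
  def matmul(a: Mat, b: Mat): Mat =
    a.map(row => b.head.indices.map(j => row.indices.map(k => row(k) * b(k)(j)).sum).toArray)

  /** Transpose of a non-empty rectangular matrix. */
  def transpose(a: Mat): Mat = a.head.indices.map(j => a.map(_(j))).toArray

  /** Numerically stable softmax over one row. */
  def softmax(row: Array[Double]): Array[Double] = {
    val m = row.max
    val e = row.map(v => math.exp(v - m))
    val s = e.sum
    e.map(_ / s)
  }

  /** Scaled dot-product attention: softmax(q k^T / sqrt(d_k)) v, row by row. */
  def attention(q: Mat, k: Mat, v: Mat): Mat = {
    val scale  = math.sqrt(k.head.length.toDouble)
    val scores = matmul(q, transpose(k)).map(_.map(_ / scale))
    matmul(scores.map(softmax), v)
  }
}
```

Each output row is a weighted average of the rows of v, with weights given by how strongly the corresponding query matches each key.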

Compute a Multi-Head Self-Attention Weight Matrix by computing attention for each head, concatenating the per-head results, and finally multiplying by the overall weight matrix w_o. The operator ++^ concatenates matrices column-wise.

Value parameters

k: the key matrix K
q: the query matrix Q (q_t over all time)
v: the value matrix V
w_o: the overall weight matrix to be applied to concatenated attention
w_q: the weight tensor for query Q (w_q(i) matrix for i-th head)
w_v: the weight tensor for value V (w_v(i) matrix for i-th head)

Attributes

Inherited from: Attention
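
The per-head computation, column-wise concatenation and final projection described above can be sketched with plain arrays as follows. The names (MultiHeadSketch, multiHead, concatCols) are hypothetical, the per-head key weights wK are an assumption (no key weight appears in the parameter list above), and concatCols stands in for the ++^ operator.

```scala
// Hypothetical sketch; not ScalaTion's implementation.
object MultiHeadSketch {
  type Mat = Array[Array[Double]]

  def matmul(a: Mat, b: Mat): Mat =
    a.map(row => b.head.indices.map(j => row.indices.map(k => row(k) * b(k)(j)).sum).toArray)

  def transpose(a: Mat): Mat = a.head.indices.map(j => a.map(_(j))).toArray

  def softmax(row: Array[Double]): Array[Double] = {
    val m = row.max
    val e = row.map(v => math.exp(v - m))
    val s = e.sum
    e.map(_ / s)
  }

  /** Scaled dot-product attention for a single head. */
  def attention(q: Mat, k: Mat, v: Mat): Mat = {
    val scale = math.sqrt(k.head.length.toDouble)
    matmul(matmul(q, transpose(k)).map(_.map(_ / scale)).map(softmax), v)
  }

  /** Concatenate matrices column-wise (the role played by ++^). */
  def concatCols(ms: Seq[Mat]): Mat =
    ms.reduce((a, b) => a.zip(b).map { case (r, s) => r ++ s })

  /** Attention per head on projected q, k, v; concatenate; project by wO. */
  def multiHead(q: Mat, k: Mat, v: Mat,
                wQ: Seq[Mat], wK: Seq[Mat], wV: Seq[Mat], wO: Mat): Mat = {
    val heads = wQ.indices.map { i =>
      attention(matmul(q, wQ(i)), matmul(k, wK(i)), matmul(v, wV(i)))
    }
    matmul(concatCols(heads), wO)
  }
}
```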

Compute a Context Vector from the given query at time t (q_t), key (K) and value (V).

Value parameters

k: the key matrix K
q_t: the query vector at time t (based on input vector x_t)
v: the value matrix V

Attributes

Inherited from: Attention
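
For a single time step, the context vector is a weighted sum of the rows of V, with weights obtained from the scores of q_t against each key. The sketch below assumes scaled dot-product scores followed by softmax; the names ContextSketch and context are hypothetical.

```scala
// Hypothetical sketch; not ScalaTion's implementation.
object ContextSketch {

  /** Numerically stable softmax over a vector. */
  def softmax(row: Array[Double]): Array[Double] = {
    val m = row.max
    val e = row.map(v => math.exp(v - m))
    val s = e.sum
    e.map(_ / s)
  }

  /** Context vector for query qT: softmax(k qT / sqrt(d)) applied to the rows of v. */
  def context(qT: Array[Double], k: Array[Array[Double]],
              v: Array[Array[Double]]): Array[Double] = {
    val scale  = math.sqrt(qT.length.toDouble)
    val scores = k.map(row => row.zip(qT).map { case (a, b) => a * b }.sum / scale)
    val w      = softmax(scores)           // one attention weight per key/value row
    v.head.indices.map(j => v.indices.map(i => w(i) * v(i)(j)).sum).toArray
  }
}
```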

Compute the Query, Key, Value matrices from the given input and weight matrices.

Value parameters

w_q: the weight matrix for query Q
w_v: the weight matrix for value V
x: the input matrix

Attributes

Inherited from: Attention
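
In the usual formulation, the three matrices are linear projections of the same input: Q = X W_q, K = X W_k, V = X W_v. A minimal sketch with plain arrays follows; the names QkvSketch and queryKeyValue are hypothetical, and the key weight wK is an assumption since it is not listed among the parameters above.

```scala
// Hypothetical sketch; not ScalaTion's implementation.
object QkvSketch {
  type Mat = Array[Array[Double]]

  /** Plain matrix product a * b. */
  def matmul(a: Mat, b: Mat): Mat =
    a.map(row => b.head.indices.map(j => row.indices.map(k => row(k) * b(k)(j)).sum).toArray)

  /** Project the input x into query, key and value matrices. */
  def queryKeyValue(x: Mat, wQ: Mat, wK: Mat, wV: Mat): (Mat, Mat, Mat) =
    (matmul(x, wQ), matmul(x, wK), matmul(x, wV))
}
```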

Concrete fields

Inherited fields

Attributes

Inherited from: Attention

Attributes

Inherited from: Attention

Attributes

Inherited from: Attention
