ScaledDotProductAttention
Implements the Scaled Dot-Product Attention mechanism. This sequence module computes attention scores from the query (q) and key (k) tensors and applies them to the value tensor (v). It is a fundamental building block of transformer models.
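For reference, the paper cited under "See also" defines scaled dot-product attention as below, where d_k is the dimension of the key vectors. This is the paper's formulation; whether this module adds masking or dropout on top of it is not stated on this page.

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```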
Attributes
- See also: "Attention Is All You Need" by Vaswani et al., 2017. https://arxiv.org/abs/1706.03762
Members list
Value members
Concrete methods
Forward pass for the Scaled Dot-Product Attention module. This method takes three input tensors (q, k, v), computes the attention scores, and applies them to the value tensor.
Value parameters
- inputs: an IndexedSeq containing the query (q), key (k), and value (v) tensors
Attributes
- Returns: an IndexedSeq containing the resulting attention tensor
- Throws: IllegalArgumentException if the number of inputs is not 3
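The forward pass described above can be sketched on plain nested vectors. This is a minimal illustration, not the library's implementation: the `AttentionSketch` object, the `Matrix` alias, and the helper functions are hypothetical names, and the real module operates on the library's own tensor type. The input order (q, k, v) is assumed from the parameter description.

```scala
object AttentionSketch {
  // Hypothetical stand-in for the library's tensor type: a row-major 2-D matrix.
  type Matrix = Vector[Vector[Double]]

  // Numerically stable softmax over one row of scores.
  def softmax(row: Vector[Double]): Vector[Double] = {
    val m = row.max
    val exps = row.map(x => math.exp(x - m))
    val sum = exps.sum
    exps.map(_ / sum)
  }

  // a (n x d) times the transpose of b (m x d), giving n x m.
  def matMulTransposed(a: Matrix, b: Matrix): Matrix =
    a.map(ra => b.map(rb => ra.zip(rb).map { case (x, y) => x * y }.sum))

  // a (n x m) times b (m x d), giving n x d.
  def matMul(a: Matrix, b: Matrix): Matrix =
    a.map(ra => b.transpose.map(col => ra.zip(col).map { case (x, y) => x * y }.sum))

  // Assumed input order: q, k, v. Returns a single-element sequence.
  def forward(inputs: IndexedSeq[Matrix]): IndexedSeq[Matrix] = {
    // require throws IllegalArgumentException when the condition fails.
    require(inputs.length == 3, s"expected 3 inputs (q, k, v), got ${inputs.length}")
    val q = inputs(0)
    val k = inputs(1)
    val v = inputs(2)
    val dk = k.head.length.toDouble
    // Scale the raw dot products by sqrt(d_k) before the softmax.
    val scores = matMulTransposed(q, k).map(_.map(_ / math.sqrt(dk)))
    val weights = scores.map(softmax)
    IndexedSeq(matMul(weights, v))
  }
}
```

With two identical keys, the attention weights are uniform, so the output row is the mean of the value rows.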
Inherited methods
Alias for forward, allows calling the module as a function: module(xs).
Attributes
- Inherited from: SeqModule
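The aliasing described above is the usual Scala `apply` convention. A minimal sketch, assuming a simplified trait: `SeqModuleSketch` and `Doubler` are hypothetical names, and the element type is a placeholder for the library's tensor type.

```scala
object ApplySketch {
  // Hypothetical stand-in for SeqModule; T is a placeholder element type.
  trait SeqModuleSketch[T] {
    def forward(inputs: IndexedSeq[T]): IndexedSeq[T]
    // apply delegates to forward, so module(xs) is the same as module.forward(xs).
    def apply(inputs: IndexedSeq[T]): IndexedSeq[T] = forward(inputs)
  }

  // Toy module that doubles every input, used only to show the call syntax.
  object Doubler extends SeqModuleSketch[Double] {
    def forward(inputs: IndexedSeq[Double]): IndexedSeq[Double] = inputs.map(_ * 2)
  }
}
```

Here `ApplySketch.Doubler(xs)` and `ApplySketch.Doubler.forward(xs)` return the same result.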
Set the module to evaluation mode (and all submodules recursively).
Attributes
- Inherited from: BaseModule
Return the gradients of all parameters.
Return all trainable parameters, including those from submodules.
Attributes
- Inherited from: BaseModule
Replace the current parameters with new ones. Useful for weight updates, loading saved models, etc.
Value parameters
- newParams: the new parameter list to assign
Attributes
- Inherited from: BaseModule
Set the module to training mode (and all submodules recursively).
Attributes
- Inherited from: BaseModule
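The recursive switch between training and evaluation mode can be sketched as follows. All names here (`ModuleSketch`, `train`, `eval`, `training`) are assumptions for illustration, since this page does not show the actual member names, and the submodules are passed in explicitly rather than detected automatically.

```scala
object ModeSketch {
  // Hypothetical stand-in for BaseModule; only the mode flag is modeled.
  class ModuleSketch(val submodules: Seq[ModuleSketch] = Seq.empty) {
    // Assumed flag name; true means training mode.
    var training: Boolean = true

    def train(): Unit = setMode(true)
    def eval(): Unit = setMode(false)

    // Set the flag on this module, then recurse into every submodule.
    private def setMode(flag: Boolean): Unit = {
      training = flag
      submodules.foreach(_.setMode(flag))
    }
  }
}
```

Calling `eval()` on a root module flips the flag on every nested module, and `train()` flips it back.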
Zero out all gradients (in-place).
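One way to picture how parameter collection and in-place gradient zeroing fit together is the sketch below. `Param`, `ModuleSketch`, `parameters`, `gradients`, and `zeroGrad` are hypothetical names chosen for illustration; the library's actual parameter type and method names are not shown on this page.

```scala
object ParamSketch {
  // Hypothetical parameter: a value buffer and a same-sized gradient buffer.
  final class Param(val value: Array[Double]) {
    val grad: Array[Double] = new Array[Double](value.length)
  }

  class ModuleSketch(val own: List[Param], val submodules: List[ModuleSketch] = Nil) {
    // Own parameters first, then those of each submodule, depth-first.
    def parameters: List[Param] = own ++ submodules.flatMap(_.parameters)

    // Gradient buffers of all collected parameters.
    def gradients: List[Array[Double]] = parameters.map(_.grad)

    // Overwrite every gradient buffer with zeros, without reallocating.
    def zeroGrad(): Unit = gradients.foreach(g => java.util.Arrays.fill(g, 0.0))
  }
}
```

Because the buffers are filled rather than replaced, any reference to a gradient array held elsewhere stays valid after `zeroGrad()`.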
Inherited fields
Flag to control training or evaluation behavior.
Automatically detect submodules (other BaseModules) within this module.
Attributes
- Inherited from: BaseModule