MegEngine

This page contains details of all activation functions supported in Echo for the MegEngine backend.

Mish

echoAI.Activation.m_ops.Mish()

Applies the element-wise function:

$$\textbf{Mish}(x) = x\tanh(\text{softplus}(x))$$

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
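
A minimal usage sketch (assuming Echo and MegEngine are installed, and that the m_ops classes are called like any other MegEngine module; the tensor shape is only illustrative):

```python
import numpy as np
import megengine as mge
from echoAI.Activation.m_ops import Mish

# Mish is applied element-wise, so any (N, *) input shape works.
mish = Mish()
x = mge.Tensor(np.random.randn(4, 16).astype("float32"))
y = mish(x)
print(y.shape)  # same shape as the input: (4, 16)
```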

Reference:

Mish: A Self Regularized Non-Monotonic Activation Function

Swish

echoAI.Activation.m_ops.Swish(eswish = False, swish = True, beta = 1.735, flatten = False)

Allows the following element-wise functions:

$$\textbf{Swish}(x) = x\,\text{sigmoid}(\beta_{1} x)$$
$$\textbf{ESwish}(x) = \beta x\,\text{sigmoid}(x)$$
$$\textbf{SILU}(x) = x\,\text{sigmoid}(x)$$
$$\textbf{Flatten T-Swish}(x) = \begin{cases} x\,\text{sigmoid}(x) & \text{if } x \geq 0 \\ 0 & \text{otherwise} \end{cases}$$

Parameters:

  • eswish - Uses E-Swish activation function. Default: False.

  • swish - Uses Swish activation function. Default: False.

  • flatten - Uses Flatten T-Swish activation function. Default: False.

  • beta - $\beta$ parameter used for the E-Swish formulation. Default: 1.375

Note: When eswish, swish, and flatten are all False, the SILU activation function is initialized by default.

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
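
A sketch of how the flags select the variant (assuming Echo and MegEngine are installed; the flags are passed explicitly here rather than relying on the defaults):

```python
import numpy as np
import megengine as mge
from echoAI.Activation.m_ops import Swish

x = mge.Tensor(np.random.randn(8, 32).astype("float32"))

silu = Swish(eswish=False, swish=False, flatten=False)    # all flags False -> SILU
eswish = Swish(eswish=True, swish=False, beta=1.375)      # E-Swish with an explicit beta
fts = Swish(eswish=False, swish=False, flatten=True)      # Flatten T-Swish

print(silu(x).shape, eswish(x).shape, fts(x).shape)       # all (8, 32), same as the input
```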

References:

Searching for Activation Functions
E-swish: Adjusting Activations to Different Network Depths
Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning
Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

Aria2

echoAI.Activation.m_ops.Aria2(beta = 0.5, alpha = 1.0)

Applies the element-wise function:

$$\textbf{Aria2}(x) = (1 + e^{-\beta x})^{-\alpha}$$

Parameters:

  • beta - $\beta$ is the exponential growth rate. Default: 0.5

  • alpha - $\alpha$ is a hyper-parameter with a two-fold effect: it reduces the curvature in the third quadrant and increases the curvature in the first quadrant while lowering the value of the activation. Default: 1.0

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
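
A quick NumPy check of the formula itself, independent of the MegEngine module; with the default parameters the output at $x = 0$ is $2^{-\alpha} = 0.5$:

```python
import numpy as np

def aria2(x, beta=0.5, alpha=1.0):
    # Aria2(x) = (1 + exp(-beta * x)) ** (-alpha)
    return (1.0 + np.exp(-beta * x)) ** (-alpha)

print(aria2(0.0))                                              # (1 + 1)^(-1) = 0.5
print(aria2(np.array([-4.0, 0.0, 4.0]), beta=1.0, alpha=2.0))  # monotonically increasing in x
```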

Reference:

ARiA: Utilizing Richard's Curve for Controlling the Non-monotonicity of the Activation Function in Deep Neural Nets

ELiSH

echoAI.Activation.m_ops.Elish(hard = False)

Allows the following element-wise functions:

$$\textbf{ELiSH}(x) = \begin{cases} x\,\text{sigmoid}(x) & \text{if } x \geq 0 \\ (e^{x} - 1)\,\text{sigmoid}(x) & \text{otherwise} \end{cases}$$
$$\textbf{Hard ELiSH}(x) = \begin{cases} x \max(0, \min(1, (x+1)/2)) & \text{if } x \geq 0 \\ (e^{x} - 1) \max(0, \min(1, (x+1)/2)) & \text{otherwise} \end{cases}$$

Parameter:

  • hard - Uses Hard ELiSH activation function. Default: False

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input

Reference:

The Quest for the Golden Activation Function

ISRU

echoAI.Activation.m_ops.ISRU(alpha = 1.0, isrlu = False)

Allows the following element-wise functions:

$$\textbf{ISRU}(x) = \frac{x}{\sqrt{1 + \alpha x^{2}}}$$
$$\textbf{ISRLU}(x) = \begin{cases} x & \text{if } x \geq 0 \\ \frac{x}{\sqrt{1 + \alpha x^{2}}} & \text{otherwise} \end{cases}$$

Parameters:

  • alpha - hyper-parameter $\alpha$ controls the value to which an ISRLU saturates for negative inputs. Default: 1.0

  • isrlu - Uses ISRLU activation function. Default: False

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
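
A NumPy sketch of both variants, showing that ISRLU is the identity for non-negative inputs while ISRU saturates towards $\pm 1/\sqrt{\alpha}$:

```python
import numpy as np

def isru(x, alpha=1.0):
    # ISRU(x) = x / sqrt(1 + alpha * x^2)
    return x / np.sqrt(1.0 + alpha * x * x)

def isrlu(x, alpha=1.0):
    # identity for x >= 0, ISRU for x < 0
    return np.where(x >= 0.0, x, isru(x, alpha))

x = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
print(isru(x))    # saturates towards +/- 1/sqrt(alpha) for large |x|
print(isrlu(x))   # equals x for the non-negative entries
```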

Reference:

Improving Deep Learning by Inverse Square Root Linear Units (ISRLUs)

NLReLU

echoAI.Activation.m_ops.NLReLU(beta = 1.0)

Applies the element-wise function:

$$\textbf{NLReLU}(x) = \ln(\beta \max(0, x) + 1.0)$$

Parameter:

  • beta - $\beta$ parameter used for the NLReLU formulation. Default: 1.0

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input

Reference:

Natural-Logarithm-Rectified Activation Function in Convolutional Neural Networks

Soft Clipping

echoAI.Activation.m_ops.SoftClipping(alpha = 0.5)

Applies the element-wise function:

$$\textbf{Soft Clipping}(x) = \frac{1}{\alpha} \log\left(\frac{1 + e^{\alpha x}}{1 + e^{\alpha (x-1)}}\right)$$

Parameter:

  • alpha - $\alpha$ hyper-parameter, which determines how close to linear the central region is and how sharply the linear region turns to the asymptotic values. Default: 0.5

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
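
A NumPy check of the formula's behaviour: outputs stay in $(0, 1)$, and for large $\alpha$ the function approaches a hard clip of $x$ to $[0, 1]$:

```python
import numpy as np

def soft_clipping(x, alpha=0.5):
    # SoftClipping(x) = (1/alpha) * log((1 + exp(alpha*x)) / (1 + exp(alpha*(x - 1))))
    return (1.0 / alpha) * np.log((1.0 + np.exp(alpha * x)) / (1.0 + np.exp(alpha * (x - 1.0))))

x = np.array([-1.0, 0.25, 0.5, 2.0])
for alpha in (0.5, 5.0, 50.0):
    print(alpha, soft_clipping(x, alpha))   # alpha=50 is already close to clip(x, 0, 1)
```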

Reference:

Neural Network-Based Approach to Phase Space Integration

Soft Exponential

echoAI.Activation.m_ops.SoftExponential(alpha = None)

Applies the element-wise function:

$$\textbf{Soft Exponential}(x) = \begin{cases} -\frac{\log(1 - \alpha(x + \alpha))}{\alpha} & \text{if } \alpha < 0 \\ x & \text{if } \alpha = 0 \\ \frac{e^{\alpha x} - 1}{\alpha} + \alpha & \text{if } \alpha > 0 \end{cases}$$

Parameter:

  • alpha - $\alpha$ trainable parameter, initialized to zero when None is passed. Default: None

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
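
Because $\alpha$ is trainable, it should show up among the module's parameters and be updated by the optimizer; a sketch assuming the m_ops class follows MegEngine's standard Module interface:

```python
import numpy as np
import megengine as mge
from echoAI.Activation.m_ops import SoftExponential

act = SoftExponential()          # alpha=None -> alpha starts at zero, i.e. the identity branch
x = mge.Tensor(np.random.randn(4, 10).astype("float32"))
y = act(x)

# The trainable alpha is listed with the module's parameters,
# so an optimizer built over the whole model will update it.
print([p.shape for p in act.parameters()])
```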

Reference:

A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks

SQNL

echoAI.Activation.m_ops.SQNL()

Applies the element-wise function:

$$\textbf{SQNL}(x) = \begin{cases} 1 & \text{if } x > 2 \\ x - \frac{x^{2}}{4} & \text{if } 0 \leq x \leq 2 \\ x + \frac{x^{2}}{4} & \text{if } -2 \leq x < 0 \\ -1 & \text{if } x < -2 \end{cases}$$

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
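
The piecewise definition is easy to check numerically; a small NumPy sketch of the formula:

```python
import numpy as np

def sqnl(x):
    # Piecewise-quadratic curve that saturates at +/-1 outside [-2, 2].
    return np.where(x > 2.0, 1.0,
           np.where(x >= 0.0, x - x * x / 4.0,
           np.where(x >= -2.0, x + x * x / 4.0, -1.0)))

print(sqnl(np.array([-3.0, -1.0, 0.0, 1.0, 3.0])))  # [-1.   -0.75  0.    0.75  1.  ]
```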

Reference:

SQNL: A New Computationally Efficient Activation Function

SReLU

echoAI.Activation.m_ops.SReLU(in_features, parameters = None)

Applies the element-wise function:

$$\textbf{SReLU}(x_{i}) = \begin{cases} t_{i}^{r} + a_{i}^{r}(x_{i} - t_{i}^{r}) & \text{if } x_{i} \geq t_{i}^{r} \\ x_{i} & \text{if } t_{i}^{r} > x_{i} > t_{i}^{l} \\ t_{i}^{l} + a_{i}^{l}(x_{i} - t_{i}^{l}) & \text{if } x_{i} \leq t_{i}^{l} \end{cases}$$

Parameters:

  • in_features - Shape of the input. Datatype: Tuple

  • parameters - $(t^{r}, t^{l}, a^{r}, a^{l})$ parameters for manual initialization. Default: None. If None is passed, the parameters are initialized randomly.

Shape:

  • Input: $(N, \ast)$ where $\ast$ means any number of additional dimensions

  • Output: $(N, \ast)$, same shape as input
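
A usage sketch (assuming Echo and MegEngine are installed). Leaving parameters at None gives the random initialization; passing a $(t^{r}, t^{l}, a^{r}, a^{l})$ tuple would override it. The in_features tuple below assumes it describes the trailing feature shape of the input, which is an illustrative assumption:

```python
import numpy as np
import megengine as mge
from echoAI.Activation.m_ops import SReLU

srelu = SReLU(in_features=(64,))   # thresholds and slopes are learned, randomly initialized here
x = mge.Tensor(np.random.randn(16, 64).astype("float32"))
print(srelu(x).shape)              # (16, 64), same shape as the input
```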

Reference:

Deep Learning with S-shaped Rectified Linear Activation Units

FReLU

echoAI.Activation.m_ops.FReLU(in_channels)

Applies the element-wise function:

$$\textbf{FReLU}(x) = \max(x, \mathbb{T}(x))$$

where $\mathbb{T}(x)$ is the funnel condition: a learnable spatial context, computed per channel with a depthwise convolution over a local window.

Parameter:

  • in_channels - Number of channels in the input tensor. Datatype: Integer

Shape:

  • Input: $(N, C, H, W)$ where $C$ indicates the number of channels

  • Output: $(N, C, H, W)$, same shape as input
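
A usage sketch (assuming Echo and MegEngine are installed). Because the funnel condition is spatial, the input must be a 4-D NCHW tensor and in_channels must match its channel dimension:

```python
import numpy as np
import megengine as mge
from echoAI.Activation.m_ops import FReLU

frelu = FReLU(in_channels=32)      # must match C of the (N, C, H, W) input
x = mge.Tensor(np.random.randn(2, 32, 28, 28).astype("float32"))
print(frelu(x).shape)              # (2, 32, 28, 28), same shape as the input
```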

Reference:

Funnel Activation for Visual Recognition
