Leaky ReLU

The Leaky ReLU function, also known as the leaky rectified linear unit, is a common activation function used in neural networks. Unlike the standard ReLU, it passes a small, non-zero slope for negative inputs instead of mapping them all to zero.

#Getting Started

#Introduction

#Algorithms Using Leaky ReLU

  • Neural Networks
  • Deep Learning
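As a small illustration of where Leaky ReLU fits in such models, the sketch below builds a tiny feed-forward network in PyTorch with Leaky ReLU as the hidden-layer activation (the layer sizes and random input batch are arbitrary, chosen only for this example).

import torch
import torch.nn as nn

# A minimal feed-forward network using Leaky ReLU between the two linear layers.
model = nn.Sequential(
    nn.Linear(4, 16),
    nn.LeakyReLU(negative_slope=0.01),
    nn.Linear(16, 2),
)

x = torch.randn(8, 4)   # a batch of 8 random 4-dimensional inputs
out = model(x)          # forward pass
print(out.shape)        # torch.Size([8, 2])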

#Formula

  f(x) = x        if x > 0
  f(x) = α · x    if x ≤ 0

  • x is the input.
  • α is the slope applied to negative inputs, a small positive constant commonly set to 0.01.
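The implementations later in this article compute max(0.01·x, x), which matches the piecewise definition above whenever the slope lies between 0 and 1. A quick NumPy check of that equivalence (a minimal sketch, with the slope fixed at 0.01):

import numpy as np

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
alpha = 0.01

piecewise = np.where(x > 0, x, alpha * x)  # definition: x if x > 0, else alpha*x
via_max = np.maximum(alpha * x, x)         # form used by the implementations below

print(piecewise)                        # [-0.03  -0.005  0.     0.5    3.   ]
print(np.allclose(piecewise, via_max))  # True (holds whenever 0 < alpha < 1)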

#Properties

  • Non-linear Activation
  • Avoids the "Dying ReLU" Problem
  • Fast Computation
  • Easy to Implement
  • Can Improve Accuracy
  • Limited Negative Saturation

#Advantages and Disadvantages

  • Advantages

    • Avoids the "Dying ReLU" Problem: Leaky ReLU helps to avoid the "dying ReLU" problem, in which the gradient of standard ReLU is exactly zero for negative inputs. Neurons stuck in that regime during training stop learning and degrade performance; Leaky ReLU keeps a small, non-zero gradient instead (see the gradient comparison after this list).

    • Faster Convergence: Leaky ReLU can lead to faster convergence during training, especially in deep networks, because the non-zero gradient for negative inputs keeps the corresponding weights updating during backpropagation.

    • Improved Accuracy: In some cases, using Leaky ReLU instead of other activation functions like ReLU or sigmoid can lead to improved accuracy in deep learning models.

    • Easy to Implement: Leaky ReLU is easy to implement and can be used in most deep learning frameworks and programming languages. It requires only a small modification to the ReLU function.

  • Disadvantages

    • Limited Negative Saturation: For negative inputs, Leaky ReLU simply scales the input by a small fixed slope; the negative side of the function carries only a faint, linear response and never saturates at a fixed value. In some cases an activation with different negative-side behavior may perform better.

    • Not Symmetric: Leaky ReLU is not symmetric around zero; inputs of equal magnitude but opposite sign produce very different outputs. This can be a drawback in cases where a zero-centered, symmetric activation is preferred.

    • Choosing the Slope: Leaky ReLU introduces an extra hyperparameter, the slope applied to negative inputs. The slope can affect model performance, and there is no universally best value, so it may need to be tuned (0.01 is a common default).
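To make the "dying ReLU" point above concrete, the sketch below uses PyTorch autograd to compare the gradients of ReLU and Leaky ReLU at a single negative input (the slope of 0.01 matches the rest of this article; the example is illustrative only).

import torch

# Gradient of ReLU vs. Leaky ReLU at a negative input (x = -2).
x = torch.tensor(-2.0, requires_grad=True)

torch.relu(x).backward()
print(x.grad)     # tensor(0.)     -> ReLU passes no gradient for negative inputs

x.grad = None     # clear the gradient before the second check
torch.nn.functional.leaky_relu(x, negative_slope=0.01).backward()
print(x.grad)     # tensor(0.0100) -> Leaky ReLU keeps a small, non-zero gradient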

#Implementation

#Python

import numpy as np

def leaky_relu(x):
  # np.maximum(0.01*x, x) is x for positive inputs and 0.01*x for negative ones
  return np.maximum(0.01*x, x)

#R

leaky_relu <- function(x) {
  # pmax() takes the element-wise maximum of 0.01*x and x
  return(pmax(0.01*x, x))
}

#Julia

function leaky_relu(x)
  # max.() broadcasts element-wise, so this works for scalars and arrays alike
  return max.(0.01 .* x, x)
end

#TensorFlow

import tensorflow as tf

# LeakyReLU layer with a slope of 0.01 for negative inputs
# (newer Keras versions name this argument negative_slope rather than alpha)
leaky_relu = tf.keras.layers.LeakyReLU(alpha=0.01)
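The layer can be applied directly to a tensor to see its effect (a minimal usage check, continuing from the layer defined above):

# Apply the layer to a few sample values.
inputs = tf.constant([-2.0, -0.5, 0.0, 3.0])
print(leaky_relu(inputs))   # [-0.02  -0.005  0.    3.  ]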

#PyTorch

import torch

# LeakyReLU module with a slope of 0.01 for negative inputs
leaky_relu = torch.nn.LeakyReLU(negative_slope=0.01)
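And a corresponding usage check for the PyTorch module defined above:

# Apply the module to a few sample values.
inputs = torch.tensor([-2.0, -0.5, 0.0, 3.0])
print(leaky_relu(inputs))   # tensor([-0.0200, -0.0050,  0.0000,  3.0000])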