What is logistic regression?
Logistic regression is a classification algorithm used to assign observations to a discrete set of classes. There are several types of logistic regression: binary (yes/no, pass/fail), multinomial (cats/dogs/rats), and ordinal (small, medium, large). Binary logistic regression serves a purpose similar to that of the Perceptron, but there is one key difference: the activation function.
Activation function for Perceptron: Binary step function
$$\phi (z)=\begin{cases} 1 & \text{ if } z>0\\ -1 & \text{ if } z\leq 0 \end{cases}$$
Activation function for logistic regression: Sigmoid function
$$\phi(z)=\frac{1}{1+e^{-z}}$$
Let's graph this function using Numpy and Matplotlib.
```python
import matplotlib.pyplot as plt
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.arange(-7, 7, 0.1)
plt.plot(x, sigmoid(x))
plt.show()
```
sigmoid_plot.py
As the graph shows, the sigmoid output lies between 0 and 1. Since a probability also lies in the range 0 to 1, the sigmoid function's output can be interpreted as a predicted probability.
For the binary classification problem, we set 0.5 as a threshold and define the output as:
$$\hat{y}=\begin{cases}1 & \text{ if } \phi(z)\geq 0.5 \\ 0 & \text{ if } \phi(z)< 0.5 \end{cases}$$
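This thresholding rule is a one-liner in NumPy; the sketch below assumes `z` is the already-computed net input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(z):
    """Apply the 0.5 threshold to the sigmoid output: 1 if phi(z) >= 0.5, else 0."""
    return np.where(sigmoid(z) >= 0.5, 1, 0)

# sigmoid(0) is exactly 0.5, so z = 0 is classified as 1
print(predict(np.array([-2.0, 0.0, 3.0])))  # [0 1 1]
```

Note that `sigmoid(z) >= 0.5` is equivalent to `z >= 0`, so the threshold can also be applied to the net input directly.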
How does logistic regression work?
Some steps are very similar to the Perceptron algorithm, so please refer to the link below if you are interested in the Perceptron algorithm or would like to review some basic concepts of machine learning.
1. Create and initialize the parameters of the network
2. Multiply weights by inputs and sum them up
3. Apply activation function
As explained above, logistic regression uses the sigmoid function instead of the binary step function.
```python
def sigmoid(self, z: np.ndarray) -> np.ndarray:
    """Sigmoid activation function.

    :param z: net input to the activation function
    :return: output of the sigmoid function
    """
    return 1 / (1 + np.exp(-z))
```
sigmoid_function.py
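Steps 1 through 3 can be sketched end to end as follows. The initialization scheme (small random weights, zero bias) and the sample data are assumptions for illustration, not part of the original post:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Step 1: create and initialize the parameters
n_features = 3
weights = rng.normal(scale=0.01, size=n_features)  # small random weights
bias = 0.0

# Step 2: multiply weights by inputs and sum them up (the net input z)
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])  # two example rows, three features each
z = X @ weights + bias

# Step 3: apply the activation function
output = sigmoid(z)
print(output)  # one probability per example, each strictly between 0 and 1
```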
4. Calculate the cost
We still use a cost function to measure the model's performance, but instead of the mean squared error we use the cross-entropy loss. The loss increases as the predicted probability diverges from the actual label.
$$-\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)}\log(\phi(z^{(i)}))+(1-y^{(i)})\log(1-\phi(z^{(i)}))\right)$$
where $m$ is the number of examples, $y^{(i)}$ the true label of example $i$, and $\phi(z^{(i)})$ the predicted probability for example $i$.
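The cost formula above translates directly to NumPy. This is a minimal sketch; the example labels and probabilities are made up for illustration:

```python
import numpy as np

def cross_entropy_cost(y, phi):
    """Average cross-entropy loss over m examples.

    y:   true labels (0 or 1)
    phi: predicted probabilities, i.e. the sigmoid outputs
    """
    m = y.shape[0]
    return -(1.0 / m) * np.sum(y * np.log(phi) + (1 - y) * np.log(1 - phi))

y = np.array([1, 0, 1])
phi = np.array([0.9, 0.1, 0.8])
print(cross_entropy_cost(y, phi))  # small, since predictions match the labels well
```

Confident, correct predictions (phi near y) give a cost near 0, while confident wrong predictions drive the cost toward infinity, which is exactly the behavior described above.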