Machine Learning with Python Part.1 ~Perceptron~

What is a Perceptron?

The Perceptron is one of the simplest types of artificial neural network and was invented by Frank Rosenblatt in 1957. A single-layer Perceptron is typically used for binary classification problems (1 or 0, yes or no). Its goal is to estimate the parameters that best predict the outcome given the input features. The optimal parameters yield the best approximation of the decision boundary, which separates the input data into two parts. For data with a non-linear decision boundary, a more complex algorithm, such as a deep neural network, is required instead of a Perceptron.

How the Perceptron works

1. Create and initialize the parameters of the network

In a Perceptron, a single layer has two kinds of parameters: weights and a bias.
The bias term is an additional parameter that shifts the output alongside the weighted sum of the inputs. It does not depend on the inputs and is often denoted as $w_0$.

Use random initialization for the weights: \[np.random.randn(shape)*0.01\]
Use zero initialization for the biases: \[np.zeros(shape)\]
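
As a minimal sketch of this step, the helper below builds a weight vector of small random values and a zero bias. The function name `initialize_parameters`, the `n_features` argument, and the fixed seed are assumptions made for illustration; they are not part of the original article.

```python
import numpy as np

def initialize_parameters(n_features, seed=0):
    """Return small random weights and a zero bias (hypothetical helper)."""
    # Seeded only so that the sketch is reproducible.
    rng = np.random.RandomState(seed)
    # One small random weight per input feature.
    weights = rng.randn(n_features) * 0.01
    # A single bias term, initialized to zero.
    bias = 0.0
    return weights, bias

weights, bias = initialize_parameters(n_features=2)
print(weights.shape, bias)
```

Keeping the initial weights small means the first predictions are driven mostly by the data rather than by the random starting point.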

2. Multiply weights by inputs and sum them up

Calculate the dot product of the weights and the inputs, then add the bias term. This operation is easy with NumPy, the package for scientific computing in Python:
\[np.dot(weights, inputs) + bias\]
This sum is usually called the input of the activation function, or the pre-activation value.
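
The weighted sum can be tried on a small example. The toy numbers below are made up for illustration, and the layout (features in rows, examples in columns) is an assumption chosen so that `np.dot(weights, inputs)` returns one value per example.

```python
import numpy as np

# Hypothetical toy data: 2 features (rows) x 3 examples (columns).
inputs = np.array([[1.0, 2.0, 3.0],
                   [2.0, 3.0, 4.0]])
weights = np.array([1.0, -0.5])  # one weight per feature
bias = 0.5

# Weighted sum for all examples at once; z has one entry per example.
z = np.dot(weights, inputs) + bias
print(z)  # z is [0.5, 1.0, 1.5]
```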

3. Apply activation function

The purpose of an activation function is to convert the input signal of a node into an output signal. More specifically, it restricts the output to a certain range or set of values, which helps categorize the data. There are many different types of activation functions. The Perceptron uses the binary step function, which is threshold-based:
\[\phi (z)=\begin{cases} 1 & \text{ if } z>0\\ -1 & \text{ if } z\leq 0 \end{cases}\]
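
The step function above translates directly into NumPy. This is a short sketch; the function name `step` is chosen here for illustration.

```python
import numpy as np

def step(z):
    """Binary step activation: 1 where z > 0, otherwise -1."""
    return np.where(z > 0, 1, -1)

# z = 0 falls on the "z <= 0" branch, so it maps to -1.
print(step(np.array([-0.5, 0.0, 0.5])))  # [-1 -1  1]
```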

4. Calculate the cost

In this Perceptron, the mean squared error (MSE) cost function is used to estimate how badly the model is performing. Our goal is to find the parameters that minimize this cost.
The formula for the MSE is:
\[\frac{1}{2m}\sum_{i=1}^{m}(y_i-\hat{y}_i)^2\]
where $m$ is the number of examples, $y$ is the true label vector, and $\hat{y}$ is the output prediction vector.
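
A small sketch of the cost computation follows. The function name `mse_cost` and the toy labels are assumptions for illustration; the labels use the 1 / -1 values produced by the step function.

```python
import numpy as np

def mse_cost(y, y_hat):
    """Mean squared error with the 1/(2m) factor used in the formula above."""
    m = y.shape[0]  # number of examples
    return np.sum((y - y_hat) ** 2) / (2 * m)

y = np.array([1, -1, 1, -1])      # true labels
y_hat = np.array([1, 1, 1, -1])   # predictions; the second one is wrong
print(mse_cost(y, y_hat))  # 0.5
```

With labels in {1, -1}, each misclassified example contributes $(\pm 2)^2 = 4$ to the sum, so the cost is zero only when every example is classified correctly.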

5. Update the parameters

We update the weights and the bias using the derivative of the cost function.
The Perceptron uses an optimization algorithm called gradient descent to update the parameters.
The gradient can be computed as follows:
\[dw = \frac{1}{m}np.dot(inputs, \hat{y}-y)\]
\[db = \frac{1}{m}np.sum(\hat{y}-y)\]
where $dw$ is the derivative of the cost function with respect to the weights and $db$ is the derivative of the cost function with respect to the bias. The error appears as $\hat{y}-y$ because differentiating $(y_i-\hat{y}_i)^2$ with respect to the prediction introduces a minus sign.

Gradient descent algorithm:
\[weights = weights - learning\_rate*dw\]
\[bias = bias - learning\_rate*db\]
where $learning\_rate$ is the step size of the gradient descent update rule ($0 < learning\_rate < 1$).
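
One update step can be sketched as below. The function name `update_parameters`, the default learning rate, and the toy data are assumptions for illustration. The sketch forms the error as prediction minus label, so that subtracting `learning_rate * dw` moves the parameters toward a lower cost.

```python
import numpy as np

def update_parameters(weights, bias, inputs, y, y_hat, learning_rate=0.1):
    """One gradient descent step (hypothetical helper).

    inputs has shape (n_features, m); y and y_hat have shape (m,).
    """
    m = y.shape[0]
    error = y_hat - y                # prediction minus label
    dw = np.dot(inputs, error) / m   # gradient with respect to the weights
    db = np.sum(error) / m           # gradient with respect to the bias
    return weights - learning_rate * dw, bias - learning_rate * db

# Toy case: the first example (features 1, 0) has label 1 but was predicted -1.
inputs = np.array([[1.0, 2.0],
                   [0.0, 1.0]])
y = np.array([1, -1])
y_hat = np.array([-1, -1])
new_weights, new_bias = update_parameters(np.zeros(2), 0.0, inputs, y, y_hat)
print(new_weights, new_bias)  # weights become [0.1, 0.0], bias becomes 0.1
```

The update nudges the weights toward the misclassified positive example, which is the behavior expected from the Perceptron update.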

6. Repeat steps 2-5 until the cost function converges

7. The complete Perceptron code
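
The steps above can be combined into one short script. What follows is a minimal sketch assembled from steps 1 to 6, not the author's original listing. The function names, the toy dataset, the seed, the `max_epochs` cap, and the choice to stop as soon as the cost reaches zero are all assumptions made for illustration.

```python
import numpy as np

def step(z):
    """Binary step activation: 1 where z > 0, otherwise -1."""
    return np.where(z > 0, 1, -1)

def train_perceptron(inputs, y, learning_rate=0.1, max_epochs=100, seed=0):
    """Train a single-layer Perceptron on inputs of shape (n_features, m)."""
    n_features, m = inputs.shape
    # Step 1: small random weights, zero bias.
    rng = np.random.RandomState(seed)
    weights = rng.randn(n_features) * 0.01
    bias = 0.0
    costs = []
    for _ in range(max_epochs):
        # Step 2: weighted sum of the inputs plus the bias.
        z = np.dot(weights, inputs) + bias
        # Step 3: step activation.
        y_hat = step(z)
        # Step 4: MSE cost with the 1/(2m) factor.
        cost = np.sum((y - y_hat) ** 2) / (2 * m)
        costs.append(cost)
        # Step 6: stop once every example is classified correctly.
        if cost == 0:
            break
        # Step 5: gradient descent update (error is prediction minus label).
        error = y_hat - y
        weights = weights - learning_rate * np.dot(inputs, error) / m
        bias = bias - learning_rate * np.sum(error) / m
    return weights, bias, costs

def predict(inputs, weights, bias):
    """Classify each column of inputs as 1 or -1."""
    return step(np.dot(weights, inputs) + bias)

# Hypothetical, linearly separable toy data: 2 features x 4 examples.
inputs = np.array([[2.0, 3.0, -2.0, -1.0],
                   [2.0, 1.0, -1.0, -3.0]])
y = np.array([-1, -1, 1, 1])

weights, bias, costs = train_perceptron(inputs, y)
print("costs per epoch:", costs)
print("predictions:", predict(inputs, weights, bias))
```

Because the toy data is linearly separable, the loop ends with a cost of zero well before `max_epochs`. On data that is not linearly separable the cost never reaches zero, so the `max_epochs` cap is what ends training.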
