
Linear Regression with Perceptron Using the PyTorch Library in Python


This article was published as part of the Data Science Blogathon

Linear Regression Overview

“Without understanding the engine, building or working on a car is just playing with metal”

This seems to hold true in almost all areas of life: without the basics, creation and innovation are simply not possible. In this guide, we will understand what linear regression is and how we can implement it using a neural network. The basic unit of any neural network, simple or complex, is the neuron. A neural network that contains a single neuron is called a perceptron. It was invented by Frank Rosenblatt at the Cornell Aeronautical Laboratory in 1958, so it has been around for more than 60 years.

During the current rise of real-world deep learning applications, dense neural networks have largely displaced the single perceptron, but that does not make it irrelevant. Here, we will look at the theory as well as the code for constructing a perceptron to solve a linear regression problem using PyTorch.

PyTorch is a framework designed and developed by Facebook to make it easy to write artificial intelligence and machine learning code using tensor computations. It is one of the top three frameworks for developing deep learning applications and models. PyTorch is a Python package that offers two high-level features:

  • Tensor computation (similar to NumPy) with strong support for GPU acceleration.
  • Deep neural networks built on a tape-based autograd system (one of the methods for computing gradients automatically).

If you want to read more about PyTorch, here is their official link.

Terms Related to Linear Regression with Perceptron

Tensor: A tensor is a multidimensional array that stores data just like any other data structure. The stored values can be accessed through indexing. For better intuition, think of tensors as a series of structures of increasing complexity: scalar -> vector -> matrix -> tensor.
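As a quick illustration (the variable names here are purely for demonstration), this is how that progression looks in torch:

import torch

scalar = torch.tensor(7.0)                        # 0-D: a single value
vector = torch.tensor([1.0, 2.0, 3.0])            # 1-D: a list of values
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # 2-D: rows and columns
cube = torch.rand(2, 3, 4)                        # 3-D: stacked matrices

print(scalar.ndim, vector.ndim, matrix.ndim, cube.ndim)  # 0 1 2 3
print(matrix[1][0])  # access by indexing: tensor(3.)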

Optimization: The process of adjusting certain values to obtain a better result.

Loss: The difference between the actual and predicted output is called the ‘loss’. The term refers to the value that must be minimized to obtain the best optimized model.

Variables: The input values we already have as data for the model are called variables (the independent variables, or features). Their values are already fixed at training and inference (evaluation/testing) time.

Weights: The coefficients in the linear equation, optimized during training to reduce the loss, are called the model weights. Together with the bias, they are also known as the model parameters.

Bias: The constant value in the linear equation that controls the vertical position of the line on the Cartesian plane. It is also called the “y-intercept” (the point where the linear regression line intersects the y-axis).

Linear regression is one of the oldest machine learning and statistical algorithms, and it sparked the boom in predictive analytics with its simplicity and strong conceptual foundations. The name defines itself. Linear means that values follow a straight line, with no curves or corners. Regression, on the other hand, means “a return to a previous or less developed state”. We use the word regression because we are looking for regular, reliable patterns that repeat in the long run. If we worked with individual values rather than the aggregate, there would be little exact relationship between them, because each value involves its own particular circumstances. So, to find a pattern for an approximate prediction of the future, we step back to a “less developed” view and look at the bigger picture instead of focusing on the details.

Now, how do we actually use it in practice?

y = Wx + b

where x is our input (the independent variable), W is the coefficient of the independent variable, and b is the bias (a constant).

Linear equation in one variable
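As a quick numeric sketch (the values of W and b here are made up purely for illustration):

# evaluate y = W*x + b with hypothetical coefficient and bias
W, b = 0.4, 0.5
x = 5.0
y = W * x + b
print(y)  # 2.5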

The simplest neural network consists of just one neuron, called the perceptron. The ‘neuron’ here should not be confused with the biological brain cell, although the biological neuron inspired the naming of this mathematical model. We don’t need to go into too much detail here; the relevant parts are touched upon in later sections of the article.


Perceptron (a single-neuron network)

The first equation is the same equation used in simple linear regression, with one independent variable and one dependent variable: x is our input (the independent variable), and W and b are the coefficients. This part is responsible for learning the linear behavior of the data. But most of the time it is not enough to rely on linearity alone, and capturing nonlinearity becomes critical for an accurate and reasonable model. For that case, the neuron contains a second equation, known as the “activation function”, whose choice depends on the type of problem and the methods you are using. To build a purely linear model we don’t need an activation function, so we simply avoid using one.
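Here is a minimal sketch of the two equations inside a neuron, with sigmoid chosen arbitrarily as an example activation (our linear-regression model will skip the activation step entirely; the W and b values are made up):

import torch

x = torch.tensor([[4.7]])   # input
W = torch.tensor([[0.4]])   # weight (made-up value)
b = torch.tensor([[0.5]])   # bias (made-up value)

z = x.mm(W).add(b)      # equation 1: the linear part, Wx + b
a = torch.sigmoid(z)    # equation 2: the activation (omitted in our model)
print(z.item(), a.item())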

Code in Pytorch for Linear Regression with Perceptron

Before we start, you should know that the Python package we use for PyTorch is ‘torch’. The first and foremost thing in any project is knowing the basic libraries and packages that will help you implement it successfully and smartly. In our case, apart from torch, we will use NumPy for mathematical computation and Matplotlib for visualization.

1. Import libraries and create dataset

import numpy as np
import matplotlib.pyplot as plt
import torch

The dataset is created using NumPy arrays.

x_train = np.array([[4.7], [2.4], [7.5], [7.1], [4.3], 
                    [7.8], [8.9], [5.2], [4.59], [2.1], 
                    [8], [5], [7.5], [5], [4],
                    [8], [5.2], [4.9], [3], [4.7], 
                    [4], [4.8], [3.5], [2.1], [4.1]],
                   dtype=np.float32)
y_train = np.array([[2.6], [1.6], [3.09], [2.4], [2.4], 
                    [3.3], [2.6], [1.96], [3.13], [1.76], 
                    [3.2], [2.1], [1.6], [2.5], [2.2], 
                    [2.75], [2.4], [1.8], [1], [2], 
                    [1.6], [2.4], [2.6], [1.5], [3.1]], 
                   dtype=np.float32)

We created the dataset with some arbitrary values as NumPy arrays.

Data visualization.

plt.figure(figsize=(8,8))
plt.scatter(x_train, y_train, c="green", s=200, label="Original data")
plt.legend()
plt.show()
Scatter plot of the original data

2. Data Preparation and Modeling with Pytorch

Now, the next step is to convert these NumPy arrays into PyTorch tensors (described above in the terminology section), because tensors are the underlying data structures that enable all of PyTorch’s machine learning and deep learning functionality. So, let’s do that.

X_train = torch.from_numpy(x_train) 
Y_train = torch.from_numpy(y_train)
print('requires_grad for X_train: ', X_train.requires_grad)
print('requires_grad for Y_train: ', Y_train.requires_grad)

Note: requires_grad is the attribute that controls whether a tensor’s gradient should be computed during training. Tensors with requires_grad set to False do not store their gradients for later use.
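A tiny, self-contained sketch of how requires_grad behaves (the values are chosen purely for illustration):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = (3 * x ** 2).sum()   # y = 3x^2, so dy/dx = 6x
y.backward()
print(x.grad)            # tensor([12.]) since 6 * 2 = 12

x_no_grad = torch.tensor([2.0])   # requires_grad defaults to False
print(x_no_grad.requires_grad)    # False: no gradient will be stored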

Modeling in Pytorch

In this article, we create the simplest possible model. This model contains only the first equation of the neuron, which is responsible for capturing the linearity in the data, as mentioned above. The parameters of the model are w1 and b1, the weight and the bias respectively. These parameters are adjusted by the optimizer during training. On the other hand, we have some hyperparameters, which the developer controls directly and which are used to manage the training process and steer it in a better direction. Different neural network architectures come with different hyperparameters; here are the ones we will use in this model. We define them first, because the shapes of the parameter tensors depend on them.

input_size = 1 
hidden_size = 1
output_size = 1 
learning_rate = 0.001

input_size, hidden_size and output_size are all 1 because there is only one neuron; each value indicates the number of neurons in its layer. The learning rate, as the name implies, controls how large a step the network takes when changing the parameters. Too high a learning rate causes drastic changes in the values and makes it difficult to reach the optimum; too small a learning rate makes the model take a very long time to get there (a tiny illustration of this trade-off follows after the parameter definitions below).

Now the parameters themselves:

w1 = torch.rand(input_size, 
                hidden_size, 
                requires_grad=True)
b1 = torch.rand(hidden_size, 
                output_size, 
                requires_grad=True)

Note that here we explicitly declare that these tensors must have requires_grad set to True, so that their gradients are computed during training and used by the optimizer to fine-tune them.
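To make the learning-rate trade-off concrete, here is a minimal sketch of a single gradient-descent step on the toy loss (w - 3)^2, whose minimum is at w = 3 (the numbers are purely illustrative):

import torch

for lr in (0.001, 0.1, 10.0):
    w = torch.tensor(0.0, requires_grad=True)
    loss = (w - 3).pow(2)     # gradient at w=0 is 2*(0-3) = -6
    loss.backward()
    with torch.no_grad():
        w -= lr * w.grad      # one gradient-descent step
    print(lr, w.item())       # 0.006 (barely moves), 0.6, 60.0 (overshoots)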

Training refers to the stage in your AI workflow when you let your model learn by feeding it data and allowing it to improve the values responsible for the predictions. This is an iterative process that requires multiple passes over the data, making the model incrementally better and more accurate. The training phase is divided into two parts: forward propagation and backpropagation. To expose the neuron’s dynamics, I have broken the sub-processes involved in backpropagation into separate individual steps (steps 2 to 4) below.

Keep in mind that forward propagation is the data coming in from the input and passing through all the components of the model in order to compute the output value. Backpropagation starts by computing the loss between the predicted value and the label value, and then optimizes the network parameters based on the gradient of each parameter with respect to that loss. Here is how it is done in PyTorch:

for iter in range(1, 4001):
    # forward pass: compute predictions from the inputs
    y_pred = X_train.mm(w1).clamp(min=0).add(b1)
    # sum-of-squares loss between predictions and labels
    loss = (y_pred - Y_train).pow(2).sum()
    if iter % 100 == 0:
        print(iter, loss.item())
    # backward pass: compute gradients of the loss w.r.t. w1 and b1
    loss.backward()
    # update the parameters manually, outside of autograd's tracking
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        b1 -= learning_rate * b1.grad
        # reset the gradients so they don't accumulate into the next iteration
        w1.grad.zero_()
        b1.grad.zero_()

1. Forward pass:

  • Predict the output value y from the input X using the linear equation. The ‘mm’ function performs matrix multiplication, and ‘add’ adds two values. The clamp function binds all elements of the input into the range [min, max]: values below min are replaced with min, and values above max with max. (Strictly speaking, clamp(min=0) acts like a ReLU activation; with this positive-valued data the fit remains effectively linear. A small clamp example follows below.)
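For reference, a small sketch of how clamp behaves on made-up values:

import torch

t = torch.tensor([-1.5, 0.3, 2.0, 7.8])
print(t.clamp(min=0))         # tensor([0.0000, 0.3000, 2.0000, 7.8000])
print(t.clamp(min=0, max=5))  # tensor([0.0000, 0.3000, 2.0000, 5.0000])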

2. Computing the loss:

  • Take the difference between y_pred and Y_train, square it, and sum it up. The ‘pow’ function raises each element to the given power, and the ‘sum’ function adds up all the values (a toy example follows below).
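A toy sketch of that loss computation on made-up values:

import torch

y_pred = torch.tensor([2.0, 3.0])
y_true = torch.tensor([2.5, 2.0])
loss = (y_pred - y_true).pow(2).sum()
print(loss)  # (-0.5)^2 + (1.0)^2 = tensor(1.2500)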

3. Calling loss.backward():

  • The backward pass computes the gradient of the loss with respect to all tensors that have requires_grad=True.
  • After this function call, w1.grad and b1.grad will be tensors holding the gradient of the loss with respect to w1 and b1, respectively.

4. Updating the weights manually

  • The weights have requires_grad=True, but we don’t want autograd to track the update step itself, so we wrap it in torch.no_grad().
  • Update each weight by subtracting the product of the learning rate and its gradient.
  • Manually zero the gradients after updating the weights so that stale values do not accumulate into the next iteration; use w1.grad.zero_() and b1.grad.zero_() for that. (An equivalent loop using a built-in optimizer is sketched after this list.)
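For reference, the manual update above is the same rule that PyTorch’s built-in SGD optimizer implements; here is a sketch of the equivalent loop using torch.optim.SGD (assuming the same w1, b1, X_train, Y_train and learning_rate as above):

optimizer = torch.optim.SGD([w1, b1], lr=learning_rate)
for iter in range(1, 4001):
    y_pred = X_train.mm(w1).clamp(min=0).add(b1)
    loss = (y_pred - Y_train).pow(2).sum()
    optimizer.zero_grad()   # replaces the manual grad.zero_() calls
    loss.backward()
    optimizer.step()        # replaces the manual "w -= lr * grad" updates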

Outputs:

The loss is printed every 100 iterations and decreases steadily up to iteration 4000.

Let’s check the optimized values of w1 and b1:

print('w1: ', w1)
print('b1: ', b1)

Obtaining the predicted values by plugging the trained weights into the linear equation:

predicted_in_tensor = X_train.mm(w1).clamp(min=0).add(b1)
# detach from the graph and convert to NumPy for plotting
predicted = predicted_in_tensor.detach().numpy()

Visualize the predicted and actual values:

plt.figure(figsize=(8, 8))
plt.scatter(x_train, y_train, c="green", s=200, label="Original data")
plt.plot(x_train, predicted, label="Fitted line")
plt.legend()
plt.show()

So there we have it: a simple linear model built and trained with PyTorch. There are different ways to build a simple linear model, but through this tutorial you should have become familiar with some of the important PyTorch functions for building a neural network. This time it was a single neuron, but the same pieces can be expanded into something more powerful and better, especially by adding an activation function, working with multiple neurons, or both. You can easily grow this into a complete deep neural network. I suggest you try building several network architectures to deepen your understanding of PyTorch.
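As one possible next step (this is a sketch of an alternative, not the code used above), the same model can be written more idiomatically with torch.nn.Linear, which bundles the weight and bias into a single module:

import torch

model = torch.nn.Linear(1, 1)                   # one input feature, one output
criterion = torch.nn.MSELoss(reduction='sum')   # same sum-of-squares loss as above
optimizer = torch.optim.SGD(model.parameters(), lr=0.001)

for step in range(4000):
    y_pred = model(X_train)            # forward pass (assumes X_train from above)
    loss = criterion(y_pred, Y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(model.weight, model.bias)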

Gargia Sharma

B-Tech fourth-year student
Specializing in deep learning and data science

For more information, check out my GitHub homepage

LinkedIn

GitHub

The media described in this article is not owned by Analytics Vidhya and is used at the author’s discretion


