Setup for PyTorch Docker: https://hashnode.com/post/clkyeorvg000709l620tj4poy
Data
Let's create linear data with known parameters.
# Imports
import torch
import torch.nn as nn

# Hyperparameters (the true values our model should learn)
weight_true = 3.5
bias_true = 2.5
Create a tensor X with values from 0 up to (but not including) 5, in steps of 0.02.
# Create input data (evenly spaced values, reshaped to a column)
X = torch.arange(start=0, end=5, step=0.02).unsqueeze(1)
# Generate targets from the known linear relationship
Y = weight_true * X + bias_true
Y is the target (the ground truth), not a prediction: we generated it from the known relationship y = 3.5x + 2.5. The graph below shows this line.
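A minimal matplotlib sketch that reproduces the graph (matplotlib is assumed here; it is not part of the original post):

import matplotlib.pyplot as plt

# Plot the generated data: the straight line y = 3.5x + 2.5
plt.plot(X.numpy(), Y.numpy(), color='red')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('y = 3.5x + 2.5')
plt.show()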
Split
Let's split the data into training and testing sets. Here we will use 80% for training and the rest for testing.
def split_train_test(X, Y, train_ratio=0.8):
    train_size = int(X.shape[0] * train_ratio)
    X_train = X[:train_size]
    Y_train = Y[:train_size]
    X_test = X[train_size:]
    Y_test = Y[train_size:]
    print("Length of train data: ", len(X_train))
    print("Length of test data: ", len(X_test))
    return X_train, Y_train, X_test, Y_test
# Split the data into train and test
X_train, Y_train, X_test, Y_test = split_train_test(X, Y, train_ratio=0.8)
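Since torch.arange(start=0, end=5, step=0.02) produces 250 points, the 80/20 split prints:

Length of train data:  200
Length of test data:  50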
Create Model
Let's create a model with the two parameters we want it to learn. By defining the 'LinearRegression' class in this way, you create a PyTorch module that can be trained to learn the optimal values of 'weights' and 'bias' from the training data, fitting a linear relationship between input X and output Y.
The code below defines a simple linear regression model using PyTorch. Let's break down the components and their roles:

class LinearRegression(nn.Module): This line defines a custom PyTorch module called LinearRegression. It inherits from nn.Module, which is the base class for all PyTorch models. This custom module will represent our linear regression model.

def __init__(self): This is the constructor method for the LinearRegression class. It is called when you create an instance of the class. Inside the constructor, you define the model's parameters and initialize them.

super().__init__(): This line calls the constructor of the parent class (nn.Module). It's necessary to include this line in your custom module's constructor.

self.weights = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float32)): Here, you define a model parameter weights using nn.Parameter. nn.Parameter is a wrapper for tensors that tells PyTorch to treat the tensor as a learnable parameter. In simple linear regression, weights represents the slope of the regression line. torch.randn(1) initializes it with a random value, and requires_grad=True indicates that gradients should be calculated for this parameter during backpropagation.

self.bias = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float32)): Similarly, you define a model parameter bias representing the y-intercept of the regression line. It's also initialized with a random value and set to be a learnable parameter.
Forward pass: the forward method defines what output the model produces when we feed in an input x. Here it applies the linear equation x * weights + bias.
# Build the model
class LinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        # Parameters that need to be learned
        self.weights = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float32))
        self.bias = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float32))

    def forward(self, x):
        return x * self.weights + self.bias
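As a quick sanity check (not in the original post), you can instantiate the module and list its learnable parameters; check_model is a throwaway name used only for this illustration:

# Instantiate the model and inspect its learnable parameters
check_model = LinearRegression()
for name, param in check_model.named_parameters():
    print(name, param.data)
# Prints 'weights' and 'bias', each a 1-element tensor with a random initial value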
Train the model
Let's go through the code line by line:
torch.manual_seed(100): This line sets the random seed for PyTorch's random number generator. Setting a random seed ensures that the random initialization of model parameters and other random operations are reproducible, meaning you get the same results when you run the code again with the same seed.

model = LinearRegression(): This line creates an instance of the LinearRegression model that you defined earlier. It initializes the model with random values for the weights and bias.

criterion = nn.MSELoss(): Here, you define the loss function for training the model. nn.MSELoss() stands for Mean Squared Error loss, which is commonly used in regression tasks. It measures the mean squared difference between the predicted values (Y_pred) and the actual target values (Y_train).

optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate): You define an optimizer for updating the model's parameters. In this case, you're using stochastic gradient descent (SGD). model.parameters() provides the model's parameters (i.e., weights and bias) to be updated during training, and lr (the learning rate) determines the step size for parameter updates. (learning_rate itself is never defined in the post; the code below assumes 0.01, a typical value.)

num_epochs = 200: This variable specifies the number of training epochs, which is the number of times the entire dataset will be used for training.

train_loss = []: This empty list will be used to store the training loss value for each epoch.

Training loop:

for epoch in range(num_epochs): This loop iterates through the specified number of training epochs.

Y_pred = model(X_train): This line computes the model's predictions (Y_pred) for the training data (X_train) using the current parameter values.

loss = criterion(Y_pred, Y_train): Here, you calculate the Mean Squared Error loss between the predicted values (Y_pred) and the actual training targets (Y_train).

loss.backward(): This line computes the gradients of the loss with respect to the model's parameters, enabling backpropagation for gradient descent.

optimizer.step(): It updates the model's parameters using the gradients computed in the previous step. This is where the actual parameter updates happen.

optimizer.zero_grad(): After each parameter update, you zero out the gradients to avoid accumulation from previous iterations.

print('epoch {}, loss {}'.format(epoch, loss.item())): This line prints the current epoch number and the value of the loss for that epoch.

train_loss.append(loss.item()): It appends the loss value for the current epoch to the train_loss list, which you can later use for plotting or analysis.
This training loop continues for the specified number of epochs, and the model's parameters (weights and bias) are updated in each iteration to minimize the Mean Squared Error loss. As training progresses, the loss should decrease, indicating that the model is learning to fit the data. After training, you can use the trained model to make predictions on test data.
# Create a random seed
torch.manual_seed(100)

# Create an instance of the model
model = LinearRegression()

# Loss function
criterion = nn.MSELoss()

# Optimizer (learning_rate is not defined in the post; 0.01 is an assumed, typical value)
learning_rate = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

# Train the model
num_epochs = 200
train_loss = []
for epoch in range(num_epochs):
    # Forward pass
    Y_pred = model(X_train)
    # Compute loss
    loss = criterion(Y_pred, Y_train)
    # Backward pass
    loss.backward()
    # Update parameters
    optimizer.step()
    # Zero gradients before the next step
    optimizer.zero_grad()
    # Print loss
    print('epoch {}, loss {}'.format(epoch, loss.item()))
    # Store the loss so it can be plotted later
    train_loss.append(loss.item())
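The losses stored in train_loss can be plotted to visualize training progress; a minimal sketch, again assuming matplotlib:

import matplotlib.pyplot as plt

# Plot the training loss per epoch
plt.plot(range(num_epochs), train_loss)
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.title('Training Loss')
plt.show()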
The loss decreases as the model trains, which means the model is learning the linear relationship between the inputs and targets. The model is trained for 200 epochs. At epoch 0 the parameters are random values; from there the loss keeps decreasing, and the parameters converge toward the values we defined when we created the data.
[Figure: model fit at epoch 0]
[Figure: model fit at epoch 199]
Testing
# Test the model
test_loss = []
with torch.inference_mode():
    Y_pred = model(X_test)
    loss = criterion(Y_pred, Y_test)
    # Store the loss and plot it
    test_loss.append(loss.item())
    print('Test loss: ', loss.item())
    plot_data(X_test, Y_test, Y_pred.detach(), title='Test Data')
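The plot_data helper is called above but never defined in the post; a minimal sketch of what it might look like (the signature is inferred from the call, the styling is an assumption):

import matplotlib.pyplot as plt

def plot_data(X, Y, Y_pred, title=''):
    # Red line: actual targets; blue line: model predictions
    plt.plot(X.numpy(), Y.numpy(), color='red', label='Actual')
    plt.plot(X.numpy(), Y_pred.numpy(), color='blue', label='Predicted')
    plt.title(title)
    plt.legend()
    plt.show()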
You use the model to make predictions (Y_pred) on the test data (X_test). You calculate the loss between the predicted values (Y_pred) and the actual target values (Y_test) using the loss criterion (criterion). The test loss is appended to the test_loss list for later analysis. Finally, you print the test loss to see how well your model performs on the test data.

The blue line predicted by the model is getting close to the red line. We can train the model for more epochs for better results.
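To see how close the learned parameters got to the true values used to generate the data (weight_true = 3.5, bias_true = 2.5), you can inspect the model's state dict; the exact numbers will vary, but they should be close:

# Compare the learned parameters with the true ones
print(model.state_dict())   # weights should be near 3.5, bias near 2.5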
Save the Model
# Save the model
torch.save(model.state_dict(), 'model.pth')
print('Saved PyTorch Model State to model.pth')
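To reuse the saved weights later, load them into a fresh instance of the same class (a minimal sketch; loaded_model is an illustrative name):

# Recreate the architecture and load the saved parameters
loaded_model = LinearRegression()
loaded_model.load_state_dict(torch.load('model.pth'))
loaded_model.eval()  # switch to evaluation mode before inference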