Transfer learning with TensorFlow – Analytics Vidhya – News Couple

This article was published as part of the Data Science Blogathon.

Brief introduction to transfer learning

The most pervasive problem in machine learning relates to data: there is either too little of it or its quality is low. The obvious solution is to get more and better data. In practice, however, quantity and quality rarely come together, and we have to sacrifice one for the other. Fortunately, there is a more innovative solution: transfer learning.

Transfer learning is a way of reusing an already trained model for another task. The original training step is called pre-training. The general idea is that pre-training “teaches” the model general features, while the final training phase “learns” the features specific to our (limited) data.

Transfer learning is particularly useful in areas such as medicine, where lack of data remains a perennial problem. Several CNN models pre-trained on ImageNet have been shown to be successful in various medical tasks [7]. All it takes is a few lines of code to apply them to medical data.

In this article, we will learn how to do that using TensorFlow, one of the most widely used deep learning platforms (as of 2021). Before we delve into the code, let’s have a quick summary of TensorFlow and its high-level Keras API.

TensorFlow and the Keras API

TensorFlow is an end-to-end platform for building and deploying ML models. We are only interested in building models, not deploying them, and for this we will use Keras. Keras is an API designed for “humans, not machines,” as its developers put it. This means that Keras is designed for programmers like us who want to build custom models. Its simple and easy-to-remember syntax makes it almost addictive.

While the Keras API is available as a standalone Python library, it also ships as part of the TensorFlow library. It is recommended to use tensorflow.keras rather than standalone Keras, as it is maintained by the TensorFlow team, which ensures consistency with other TensorFlow modules.

Case Study: Binary Image Classification

As a first example, we will tackle binary image classification. Our dataset will be Hot Dog - Not Hot Dog from Kaggle [6], and we will try to predict (you guessed it) whether a given image shows a hot dog or not.

For this, we will use the ResNet50 model pre-trained on the ImageNet dataset. ResNet refers to a family of architectures that use residual connections to solve the degradation problem, that is, the loss of accuracy as networks get deeper.

[Figure: a residual connection in ResNet]

The figure above depicts a residual connection. The connection skips one or more layers and adds the input back to their output through an identity mapping, giving F(x) + x. This slight modification of the network architecture has been a huge success against the degradation problem [8]. As a result, ResNet architectures have been trained at depths of more than 1,000 layers [8]. Our specific choice of model, ResNet50, is a relatively shallow example. You can see its overall structure in the following figure:

[Figure: overall structure of ResNet50]
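
To make the F(x) + x idea from the residual connection more concrete, here is a minimal NumPy sketch. It is not ResNet50's actual block, which uses convolutions and batch normalization; the function name residual_block, the two dense weight matrices, and the sizes are all assumptions chosen for illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, w1, w2):
    """Return F(x) + x, where F is a small two-layer transformation."""
    f_x = relu(x @ w1) @ w2   # F(x): the residual the layers have to learn
    return f_x + x            # the skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))           # batch of 4 vectors with 8 features
w1 = rng.normal(size=(8, 8)) * 0.1    # hypothetical layer weights
w2 = rng.normal(size=(8, 8)) * 0.1

y = residual_block(x, w1, w2)
print(y.shape)  # (4, 8), same shape as the input

# With all-zero weights F(x) is zero, so the block reduces to the identity.
zeros = np.zeros((8, 8))
identity_out = residual_block(x, zeros, zeros)
print(np.allclose(identity_out, x))  # True
```

The second check hints at why the design helps with depth: a block that has nothing useful to add can fall back to passing its input through unchanged.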

There are alternatives to the ResNet family: MobileNet, Inception, and others have also proven successful in image classification. You can choose one of these, or a completely different network, and do transfer learning on that.

I’ll be working on Google Colab, which I’d recommend to anyone whose computer isn’t up to the job, although it’s not a strict requirement. You can run the code in any environment of your choice, including Jupyter Notebook or PyCharm.

Let’s go through the process step by step.

Note: This step may vary depending on your preferred environment.

# Upload the kaggle API key
from google.colab import files
files.upload()
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json
# Install the kaggle package
! pip install -q kaggle
# Download the dataset from Kaggle
! kaggle datasets download -d dansbecker/hot-dog-not-hot-dog
# Import the necessary packages
import tensorflow as tf
from tensorflow import keras
from PIL import Image
import os
import numpy as np

Load data for transfer learning with TensorFlow

# Unzip the downloaded zip file
!unzip /content/hot-dog-not-hot-dog.zip
# Let's check size of images
for image in os.listdir("/content/train/not_hot_dog"):
  a = Image.open(f"/content/train/not_hot_dog/{image}")
  print(np.asarray(a).shape)
[Figure: partial output showing the image shapes]

This is only part of the output, but we can already see that the image sizes are not fixed. ImageDataGenerator deals with this kind of problem, among many other things.

Image data is basically a set of numbers. A color image is represented by three two-dimensional matrices, one per channel. Each of these arrays consists of values between 0 and 255 (this may vary). Three of these values combined (one from each array) represent the color of a pixel. In our case, the images have the shape (512, 512, 3). This means we have 512 * 512 = 262,144 pixels and 3 channels. (As noted earlier, not all of them are 512 * 512, but we will deal with that.)
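
The array layout described above can be checked with a small NumPy sketch. The image here is synthetic (all zeros with one red pixel), not a file from the dataset, and the pixel position is an arbitrary choice.

```python
import numpy as np

# A synthetic 512 x 512 RGB image laid out as (height, width, channels).
image = np.zeros((512, 512, 3), dtype=np.uint8)

# Color the pixel at row 10, column 20 pure red: (R, G, B) = (255, 0, 0).
image[10, 20] = (255, 0, 0)

height, width, channels = image.shape
num_pixels = height * width

print(image.shape)              # (512, 512, 3)
print(num_pixels)               # 262144
print(image[10, 20].tolist())   # [255, 0, 0]
```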

# Create ImageDataGenerator objects
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator()
test_datagen = tf.keras.preprocessing.image.ImageDataGenerator()
# Assign the image directories to them
train_data_generator = train_datagen.flow_from_directory(
    "/content/train",
    target_size=(512,512)
)
test_data_generator = test_datagen.flow_from_directory(
    "/content/test",
    target_size=(512,512)
)

An ImageDataGenerator object feeds data to our model in batches when needed. This allows us to work directly with the data stored on disk, without overloading the RAM. train_data_generator and test_data_generator will be passed as arguments to the x and validation_data parameters of model.fit, respectively. Since ImageDataGenerator infers the classes from the folder names, we don’t need the y parameter. (If you try to pass an argument to y, Keras will raise an error.)
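
The way class labels follow from folder names can be sketched in plain Python. This only mirrors the idea and is not Keras's internal code: infer_class_indices is a hypothetical helper, the temporary directory stands in for /content/train, and the hot_dog folder name is an assumption about the dataset layout. In Keras itself, the actual mapping is available as train_data_generator.class_indices.

```python
import os
import tempfile

def infer_class_indices(root):
    """Map each sub-folder of root to an integer label, in sorted name order."""
    class_names = sorted(
        entry for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )
    return {name: index for index, name in enumerate(class_names)}

# Build a throwaway directory tree with one sub-folder per class.
with tempfile.TemporaryDirectory() as root:
    for class_name in ("not_hot_dog", "hot_dog"):
        os.makedirs(os.path.join(root, class_name))
    class_indices = infer_class_indices(root)

print(class_indices)  # {'hot_dog': 0, 'not_hot_dog': 1}
```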

Now that we have our training and test datasets, we can build and train our model.

First, we will load the ResNet50 model from Keras Applications.

resnet_50 = tf.keras.applications.resnet50.ResNet50(include_top=False, weights="imagenet")
resnet_50.trainable=False

include_top=False ensures that the final (classification) layer of the ResNet50 model is not loaded. weights="imagenet" loads the ImageNet weights. If we set weights=None, the weights are randomly initialized (in which case we would not be doing transfer learning). By setting the trainable attribute to False, we guarantee that the original (ImageNet) weights of the model remain unchanged during training.
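
What freezing means in practice can be illustrated without TensorFlow. The following NumPy sketch is a toy stand-in, not the ResNet50 setup: base_weights plays the role of a frozen pre-trained base, and a logistic-regression head is the only part updated by gradient descent. The data, sizes, and learning rate are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)

def base_features(x, base_weights):
    """Frozen 'pre-trained' base: a fixed projection followed by ReLU."""
    return np.maximum(x @ base_weights, 0.0)

def predict(features, w, b):
    """Trainable head: logistic regression on top of the base features."""
    return 1.0 / (1.0 + np.exp(-(features @ w + b)))

def log_loss(p, y):
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Toy binary task: 200 samples with 10 features each.
x = rng.normal(size=(200, 10))
y = (x[:, 0] > 0).astype(float)

base_weights = rng.normal(size=(10, 4))   # never updated below
base_before = base_weights.copy()
head_w, head_b = np.zeros(4), 0.0

features = base_features(x, base_weights)
initial_loss = log_loss(predict(features, head_w, head_b), y)

for _ in range(200):
    error = predict(features, head_w, head_b) - y   # dLoss/dLogit per sample
    head_w = head_w - 0.05 * (features.T @ error) / len(x)
    head_b = head_b - 0.05 * error.mean()

final_loss = log_loss(predict(features, head_w, head_b), y)

print(final_loss < initial_loss)                   # True: the head learned
print(np.array_equal(base_weights, base_before))   # True: the base is untouched
```

Setting resnet_50.trainable=False asks Keras to do the same thing at scale: gradients update only the layers added on top.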

We need a binary classifier, but the original ResNet50 has far more than two nodes in its final layer. This means that we have to add the final layer manually. I used the Functional API, which can be tricky if you are new to TensorFlow. (In that case, I would suggest the Sequential API, which has a more straightforward syntax.)

inputs = keras.Input(shape=(512,512,3))
x = resnet_50(inputs)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(2, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs, name="my_model")
model.compile(optimizer="Adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()

In these lines, we define our input and pass it to the resnet_50 model we defined earlier, pass its output to a global average pooling layer, and then pass that output to a dense layer with two nodes (one per class). The activation function should be softmax in this case. The values of the softmax output vector always sum to 1. For two nodes (each node represents a class), we have x1 + x2 = 1, where x1 and x2 are the class probabilities. (Alternatively, we could use a single node with the sigmoid activation function.) After all this, we need to compile the model by choosing an optimizer and a loss function. We can also add metrics to be tracked during training. Finally, we can train our model.
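
The two softmax claims above (the outputs sum to 1, and two softmax nodes are interchangeable with one sigmoid node) can be checked with a short standard-library sketch. The logit values are arbitrary and chosen only for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

logits = [2.0, 0.5]          # arbitrary scores for the two classes
x1, x2 = softmax(logits)

# The class probabilities sum to 1.
print(math.isclose(x1 + x2, 1.0))                        # True

# Two-node softmax equals a sigmoid of the logit difference.
print(math.isclose(x1, sigmoid(logits[0] - logits[1])))  # True
```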

model.fit(train_data_generator, validation_data=test_data_generator, epochs=5)

We’ve finished the transfer learning part. Optionally, you can fine-tune the model for better results.

Final notes

In this article, we have learned how to implement transfer learning with the help of TensorFlow. Transfer learning is a powerful approach that allows us to overcome the lack of data. However, it is not a silver bullet. There are cases where working with whatever data we have makes more sense and leads to better results. It also has alternatives, of which data augmentation is a common one. Of course, the two are not mutually exclusive: different approaches can be, and often are, combined to solve a data problem.

References

[1] https://www.tensorflow.org/tutorials/images/transfer_learning#create_the_base_model_from_the_pre-trained_convnets

[2] https://www.tensorflow.org/api_docs/python/tf/keras/applications/resnet50/ResNet50

[3] https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator

[4] https://www.tensorflow.org/api_docs/python/tf/keras/Model

[5] https://www.kaggle.com/general/74235

[6] https://www.kaggle.com/dansbecker/hot-dog-not-hot-dog

[7] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5583361/

[8] https://arxiv.org/abs/1512.03385

1 – https://www.toptal.com/machine-learning/tensorflow-machine-learning-tutorial

2 – https://github.com/tensorflow/tensorflow

3 – https://netbasequid.com/blog/social-analytics-hotdog/

4 – https://neurohive.io/en/popular-networks/resnet/

5 – https://www.researchgate.net/figure/Left-ResNet50-architecture-Blocks-with-dotted-line-represent-modules-that-might-be_fig3_331364877

6 – https://colab.research.google.com/drive/1pYVZtULa3pKncA7C2umg9LA5tqOCCos3?usp=sharing


The media described in this article is not owned by Analytics Vidhya and is used at the author’s discretion


