This article was published as part of the Data Science Blogathon
Reusing a previously trained model on a new problem is known as transfer learning. It is especially popular in deep learning at the moment because it makes it possible to train deep neural networks with comparatively little data. This is especially valuable in data science, where most real-world problems do not come with millions of labeled data points for training complex models.
Table of contents
- What is transfer learning?
- How does transfer learning work?
- Why should you use Transfer Learning?
- When to use transfer learning
- Pre-trained models
- Implementing Transfer Learning Code with Python
What is transfer learning?
In machine learning, transfer learning means reusing a previously trained model on a new problem. The machine exploits the knowledge gained from a previous task to improve its predictions on a new one. For example, if you train a classifier to predict whether an image contains a kitchen, you can use the knowledge it gained during training to help it distinguish drinks.
In transfer learning, the knowledge of an already trained machine learning model is transferred to a different but closely related problem. For example, if you trained a simple classifier to predict whether an image contains a backpack, you could use the knowledge the model gained during training to recognize other objects such as sunglasses.
With transfer learning, we are essentially trying to use what was learned in one task to generalize better in another. The weights that a network has learned on Task A are transferred to a new Task B.
Because training such models from scratch requires a large amount of computing power, transfer learning is typically applied in computer vision and in natural language processing tasks such as sentiment analysis.
How does transfer learning work?
In computer vision, neural networks typically detect edges in the first layers, shapes in the middle layers, and task-specific features in the last layers. In transfer learning, the first and middle layers are reused and only the last layers are retrained. This lets the new model benefit from the labeled data of the task the network was originally trained on.
Let’s go back to the example of the model that was trained to identify a backpack in an image and will now be used to identify sunglasses. Since the model has already learned to recognize objects in its earlier layers, we simply retrain the later layers so that it learns what distinguishes sunglasses from other objects.
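As a minimal sketch of this idea, the snippet below builds a tiny stand-in network in Keras, freezes everything except the last layer, and trains briefly on random data. The layer names, sizes, and data are made up for illustration and have nothing to do with the backpack and sunglasses example; the only point is that frozen layers keep their weights while the final layer changes.

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in network: two "general" layers followed by a task-specific head.
inputs = tf.keras.Input(shape=(8,))
x = tf.keras.layers.Dense(16, activation="relu", name="early")(inputs)
x = tf.keras.layers.Dense(16, activation="relu", name="middle")(x)
outputs = tf.keras.layers.Dense(2, activation="softmax", name="head")(x)
model = tf.keras.Model(inputs, outputs)

# Freeze every layer except the last one, so only the head is retrained.
for layer in model.layers[:-1]:
    layer.trainable = False

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Remember the weights before training so we can compare afterwards.
early_before = model.get_layer("early").get_weights()[0].copy()
head_before = model.get_layer("head").get_weights()[0].copy()

# Random data standing in for the new task (illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(size=(64, 8)).astype("float32")
labels = rng.integers(0, 2, size=(64,))
model.fit(features, labels, epochs=2, batch_size=16, verbose=0)

early_after = model.get_layer("early").get_weights()[0]
head_after = model.get_layer("head").get_weights()[0]
print("early layer unchanged:", np.array_equal(early_before, early_after))
print("head layer updated:", not np.array_equal(head_before, head_after))
```

How many layers to freeze is a judgment call: the more the new task resembles the original one, the more layers can usually stay frozen.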
Why should you use Transfer Learning?
Transfer learning offers a number of advantages, the most important of which are reduced training time, better neural network performance (in most cases), and no need for a large amount of data.
Training a neural network from scratch usually takes a lot of data, but access to that data is not always possible. That is when transfer learning comes in handy.
Because the model is already pre-trained, transfer learning makes it possible to build a solid machine learning model with comparatively little training data. This is especially useful in natural language processing, where creating large labeled datasets requires a lot of expert knowledge. Training time is also reduced, since training a deep neural network from scratch on a complex task can take days or even weeks.
When to use transfer learning
Use transfer learning when you don’t have enough labeled data to train your model from scratch, or when a pre-trained network already exists that was trained on similar data and a similar task. If the original model was trained with TensorFlow, you can simply restore it and retrain some of its layers for your own task. Keep in mind, however, that transfer learning only works if the features learned on the first task are general, meaning they are also useful for the related task. In addition, the model’s input must have the same size as the input it was originally trained on. If it does not, add a preprocessing step that resizes your input to the required size.
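To illustrate what such a resizing step does, here is a small dependency-free sketch that resizes an image with nearest-neighbour sampling. The `resize_nearest` helper is hypothetical and written only for this example; in practice you would normally rely on a library routine, such as the `target_size` argument used in the Keras code later in this article.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D image (a list of rows) using nearest-neighbour sampling."""
    old_height, old_width = len(image), len(image[0])
    resized = []
    for row in range(new_height):
        # Map each output row back to the closest source row.
        src_row = min(old_height - 1, row * old_height // new_height)
        resized.append([
            image[src_row][min(old_width - 1, col * old_width // new_width)]
            for col in range(new_width)
        ])
    return resized


# A 2x2 "image" that must be enlarged to the 4x4 input a model expects.
small = [[1, 2],
         [3, 4]]
resized = resize_nearest(small, 4, 4)
for row in resized:
    print(row)
```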
1. Training a model to reuse it
Consider a situation where you want to solve task A but lack the data to train a deep neural network. One way around this is to find a related task B for which plenty of data is available.
Train the deep neural network on task B and then use that model as the starting point for solving task A. The problem you are trying to solve determines whether you need the entire model or just a few of its layers.
If the input is the same in both tasks, you may be able to reuse the model as is and make predictions for your new input. Alternatively, changing and retraining the task-specific layers and the output layer is an approach worth exploring.
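The procedure can be sketched with a deliberately tiny model. In the toy example below, a one-feature linear model stands in for the deep network: it is first trained on a data-rich task B and then fine-tuned for a few steps on a data-poor task A. The two tasks, their coefficients, and the `train` helper are invented for illustration; the comparison simply shows that starting from task B's weights gives a lower error on task A than starting from zero with the same small training budget.

```python
def predict(w, b, x):
    return w * x + b


def mean_squared_error(w, b, xs, ys):
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(w, b, xs, ys, steps, learning_rate=0.05):
    """Plain gradient descent on the linear model y = w * x + b."""
    n = len(xs)
    for _ in range(steps):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Task B: a related task with plenty of data (y = 2.0x + 1.0).
xs_b = [i * 0.1 for i in range(21)]
ys_b = [2.0 * x + 1.0 for x in xs_b]
w_b, b_b = train(0.0, 0.0, xs_b, ys_b, steps=1000)

# Task A: the task we care about, with only three labeled points (y = 2.2x + 1.1).
xs_a = [0.0, 1.0, 2.0]
ys_a = [2.2 * x + 1.1 for x in xs_a]

# Same small training budget, two different starting points.
few_steps = 5
scratch = train(0.0, 0.0, xs_a, ys_a, steps=few_steps)
transferred = train(w_b, b_b, xs_a, ys_a, steps=few_steps)

loss_scratch = mean_squared_error(*scratch, xs_a, ys_a)
loss_transfer = mean_squared_error(*transferred, xs_a, ys_a)
print(f"from scratch: {loss_scratch:.4f}, with transfer: {loss_transfer:.4f}")
```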
2. Use a pre-trained model
The second option is to use a model that has already been trained. There are many such models available, so do some research beforehand. How many layers to reuse and how many to retrain depends on the task.
Keras ships with a collection of pre-trained models (nine at the time this article was written) that can be used for transfer learning, prediction, and fine-tuning. These models can be found here, along with some short tutorials on how to use them. Many research institutions also make their trained models available.
This type of transfer learning is the one most commonly used in deep learning.
3. Extract features
Another option is to use deep learning to discover the best representation of your problem, which means finding its most important features. This approach is known as representation learning, and it can often produce much better results than hand-designed representations.
In traditional machine learning, features are mostly hand-crafted by researchers and domain experts. Fortunately, deep learning can extract features automatically. Of course, this does not diminish the importance of feature engineering and domain knowledge; you still have to decide which features to feed into your network.
That said, neural networks are able to learn which features are important and which are not. Even for complex tasks that would otherwise require a lot of human effort, a representation learning algorithm can find a good combination of features in a short period of time.
The learned representation can then be applied to other problems as well. Simply use the first layers to find the right feature representation, but do not use the network’s output, because it is too task-specific. Instead, feed your data into the network and read the result from one of the intermediate layers.
The output of that layer can then be interpreted as a representation of the raw data.
This method is commonly used in computer vision because it can reduce the size of your dataset, cut computation time, and make the data more suitable for classical algorithms.
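A rough sketch of this feature-extraction approach is shown below. A small, randomly initialised Keras network stands in for a real pre-trained model so that the example runs offline; the layer names, sizes, random data, and the nearest-centroid classifier are all assumptions chosen for illustration. The key step is building a second model that stops at an intermediate layer and using its output as features for a classical algorithm.

```python
import numpy as np
import tensorflow as tf

# Stand-in for a pre-trained network (randomly initialised here, illustrative only).
inputs = tf.keras.Input(shape=(32,))
hidden = tf.keras.layers.Dense(64, activation="relu", name="early")(inputs)
embedding = tf.keras.layers.Dense(16, activation="relu", name="intermediate")(hidden)
outputs = tf.keras.layers.Dense(10, activation="softmax", name="task_specific_output")(embedding)
pretrained = tf.keras.Model(inputs, outputs)

# Cut the network at the intermediate layer and ignore the task-specific output.
feature_extractor = tf.keras.Model(inputs, embedding)
feature_extractor.trainable = False

# Random raw data with two made-up classes.
rng = np.random.default_rng(0)
raw_data = rng.normal(size=(20, 32)).astype("float32")
labels = np.array([0, 1] * 10)

# Each 32-value sample becomes a compact 16-value feature vector.
features = feature_extractor.predict(raw_data, verbose=0)

# A classical algorithm on top of the features: nearest class centroid.
centroids = np.stack([features[labels == c].mean(axis=0) for c in (0, 1)])
distances = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
predicted = distances.argmin(axis=1)
print(features.shape, predicted[:5])
```

With a genuinely pre-trained network, the extracted features would carry useful structure; here they only demonstrate the mechanics.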
Pre-trained models
A number of popular pre-trained machine learning models are available. One of them is the Inception-v3 model, which was trained for the ImageNet “Large Scale Visual Recognition Challenge”. Participants in this challenge had to classify images into 1,000 classes such as “zebra”, “dalmatian” and “dishwasher”.
Implementing Transfer Learning Code with Python
(Data set is Chest-CT Scan from Kaggle)
import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from glob import glob
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, Dense, MaxPooling2D, Dropout, Flatten, GlobalAveragePooling2D, Input, Lambda
from tensorflow.keras.models import Sequential
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
Loading data via Kaggle API
from google.colab import files
files.upload()  # select your kaggle.json API token when prompted
Saving kaggle.json to kaggle.json
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle datasets download -d mohamedhanyyy/chest-ctscan-images #downloading data from kaggle API of Dataset
from zipfile import ZipFile

file_name = "chest-ctscan-images.zip"
with ZipFile(file_name, 'r') as zip_file:
    zip_file.extractall()
    print('Done')
Designing our CNN model with the help of a pre-trained model
InceptionV3_model = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Dropout, Flatten, GlobalAveragePooling2D

# Freeze every layer except the last 15, which will be fine-tuned
for layer in InceptionV3_model.layers[:-15]:
    layer.trainable = False

# Attach a new classification head on top of the pre-trained base
x = InceptionV3_model.output
x = GlobalAveragePooling2D()(x)
x = Flatten()(x)
x = Dense(units=512, activation='relu')(x)
x = Dropout(0.3)(x)
x = Dense(units=512, activation='relu')(x)
x = Dropout(0.3)(x)
output = Dense(units=4, activation='softmax')(x)  # one unit per class
model = Model(InceptionV3_model.input, output)
model.summary()
Data augmentation (to prevent overfitting)
# Use ImageDataGenerator to load the images from the dataset
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)  # no flip or zoom for the test dataset

# Make sure target_size matches the input_shape the model was created with
training_set = train_datagen.flow_from_directory('/content/Data/train',
                                                 target_size=(224, 224),
                                                 batch_size=32,
                                                 class_mode="categorical")
# test_set is used during training but was not defined in the original listing;
# the '/content/Data/test' path is an assumption that mirrors the training directory
test_set = test_datagen.flow_from_directory('/content/Data/test',
                                            target_size=(224, 224),
                                            batch_size=32,
                                            class_mode="categorical")
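To make two of the generator options above more concrete, here is a small NumPy sketch of what `rescale=1./255` and `horizontal_flip=True` do to a single image. The 2x3 pixel array is made up for illustration, and the flip is applied unconditionally here, whereas the generator applies it at random.

```python
import numpy as np

# A tiny grayscale "image" with pixel values in the usual 0..255 range.
image = np.array([[0, 51, 255],
                  [102, 204, 153]], dtype=np.uint8)

# rescale=1./255 maps pixel values into the 0..1 range.
rescaled = image.astype("float32") / 255.0

# A horizontal flip mirrors the image left to right.
flipped = rescaled[:, ::-1]

print(rescaled)
print(flipped)
```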
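Training our model
# Compile the model before training; this step was missing from the listing,
# so the optimizer and loss below are assumptions suited to 4-class classification
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the model (running this cell takes some time).
# model.fit replaces the deprecated fit_generator and accepts generators directly
r = model.fit(training_set,
              validation_data=test_set,
              epochs=8,
              steps_per_epoch=len(training_set),
              validation_steps=len(test_set))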
# Plot the loss
plt.plot(r.history['loss'], label="train loss")
plt.plot(r.history['val_loss'], label="val loss")
plt.legend()
plt.savefig('LossVal_loss')  # save before show(), otherwise the saved figure is blank
plt.show()

# Plot the accuracy
plt.plot(r.history['accuracy'], label="train acc")
plt.plot(r.history['val_accuracy'], label="val acc")
plt.legend()
plt.savefig('AccVal_acc')
plt.show()
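import numpy as np

# Predict class probabilities (assumed step: y_pred was not defined in the original listing),
# then take the most likely class index for each image
y_pred = model.predict(test_set)
y_pred = np.argmax(y_pred, axis=1)
y_pred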
The above code is executed and the output for the classification with Transfer Learning is displayed under the embedded notebook:
You can access the Google Colab notebook’s Github link here
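The final `np.argmax` step of the listing above can be tried on a small hand-made array. The probabilities and true labels below are invented for illustration; the four columns simply mirror the four output units of the model.

```python
import numpy as np

# Each row holds the predicted probabilities for one image over four classes.
probabilities = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.2, 0.2, 0.5, 0.1],
    [0.4, 0.3, 0.2, 0.1],
])
true_labels = np.array([0, 1, 2, 3])

# argmax along axis 1 picks the most likely class index for every row.
predicted_labels = np.argmax(probabilities, axis=1)

# Fraction of images whose predicted class matches the true class.
accuracy = float(np.mean(predicted_labels == true_labels))
print(predicted_labels, accuracy)
```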
With this, I end this blog.
Hello everyone, Namaste
My name is Prancho Sharma and I’m passionate about data science
Thank you very much for your valuable time to read this blog. Feel free to point out any bugs (I’m a learner after all) and provide feedback or leave a comment.