
Transfer learning (and a reflection about this 100 days challenge)

Transfer learning is a way to reuse already-trained models to improve the performance of a new model. Today, we will explore this concept.

BUT FIRST, it’s time for a brief reflection about this challenge.

When I decided to start the #100DaysOfTensorflow challenge, I had two main goals: to discover features about this ecosystem that I still didn’t know and to not “forget” what I had already learned.

However, I don’t consider the #100DaysOf**Something** format optimal. Sometimes, we keep doing things that don’t “connect” just for the sake of doing them.

While I believe this may help us remember automatic things – such as “the Dense layer’s path is tf.keras.layers.Dense” – I think that analytical thinking requires more work and a deeper analysis of problems.

Also, learning TensorFlow is great, but Machine Learning, and even data analysis, is not all about this one tool. There are simpler and faster solutions for many data-related problems.

So, I will be changing the format of this challenge in the following way:

  • Instead of #100DaysOfTensorflow, let’s call this challenge #100DaysOfData
  • I will still code every day, but I may not publish complete code on a daily basis. In my opinion, rushing is the greatest enemy of good analysis. So, instead, I’d rather comment on what I have done for the day.

That being said… over the next days, I will explore TensorFlow and data for at least 1 hour per day and post the notebooks, data, and models, when available, to this repository.

Today’s notebook is available here.

Let’s start!

# do imports
import os
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
import tensorflow_datasets as tfds

Get the examples from the “Cats vs. dogs” dataset.

  • Train: 80%
  • Validation: 10%
  • Test: 10%

The dataset contains images with different shapes, each with 3 channels.

(raw_train, raw_validation, raw_test), metadata = tfds.load(
    'cats_vs_dogs',
    split=['train[:80%]', 'train[80%:90%]', 'train[90%:]'],
    with_info=True,
    as_supervised=True,
)
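If you want to confirm that the raw images really do come in different shapes, a quick sanity check like the one below helps. This is a small sketch of my own (it uses the metadata object returned by tfds.load and is not part of the official example).

# peek at a couple of raw examples: labels as strings, per-image shapes
get_label_name = metadata.features['label'].int2str

for image, label in raw_train.take(2):
  plt.figure()
  plt.imshow(image)
  plt.title(get_label_name(label))
  print(image.shape)  # heights and widths differ per image; all have 3 channels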

The first thing we will do is resize all images to 100 x 100. The official TensorFlow example uses 160 x 160, but I would like to experiment with smaller values to check the impact of this change.

IMG_SIZE = 100 # All images will be resized to 100x100

def format_example(image, label):
  image = tf.cast(image, tf.float32)
  image = (image/127.5) - 1
  image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
  return image, label

# apply to dataset
train = raw_train.map(format_example)
validation = raw_validation.map(format_example)
test = raw_test.map(format_example)

# shuffle the dataset and batch the data
BATCH_SIZE = 32
SHUFFLE_BUFFER_SIZE = 1000

train_batches = train.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE)
validation_batches = validation.batch(BATCH_SIZE)
test_batches = test.batch(BATCH_SIZE)
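To make sure the preprocessing did what we expect, we can pull one batch and inspect it. With the values above, each batch should have shape (32, 100, 100, 3) and pixel values in [-1, 1]. This check is my own addition, not part of the notebook.

# sanity check on one preprocessed batch
for image_batch, label_batch in train_batches.take(1):
  print(image_batch.shape)                      # (32, 100, 100, 3)
  print(tf.reduce_min(image_batch).numpy(),
        tf.reduce_max(image_batch).numpy())     # roughly -1.0 and 1.0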

Create the base model using pre-trained convnets

The base model used here comes from TensorFlow’s official examples and uses the MobileNet V2 model developed at Google.

According to them, “this is pre-trained on the ImageNet dataset, a large dataset consisting of 1.4M images and 1000 classes”.

IMG_SHAPE = (IMG_SIZE, IMG_SIZE, 3)

# Create the base model from the pre-trained model MobileNet V2
base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=False,
                                               weights='imagenet')

We have to “freeze” the convolutional base created before to use it as a feature extractor. Then, we add a classifier on top of it and train the top-level classifier. To freeze the model, we set the trainable flag to “False”.

base_model.trainable = False

# check model
base_model.summary()
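Before stacking anything on top, it helps to see what the frozen base actually outputs for one batch of our 100 x 100 images. In the official 160 x 160 example the base produces a 5 x 5 x 1280 block of features per image; with smaller inputs the spatial dimensions will be smaller. The check below is a small exploratory sketch of my own.

# run one batch through the frozen base and inspect the extracted features
image_batch, label_batch = next(iter(train_batches))
feature_batch = base_model(image_batch)
print(feature_batch.shape)  # (batch, height, width, 1280); height/width depend on IMG_SIZE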

To generate predictions, we use a GlobalAveragePooling2D layer and a Dense layer to convert the features into a single prediction per image.

# add layer to create features
global_average_layer = tf.keras.layers.GlobalAveragePooling2D()

# add a prediction layer
prediction_layer = tf.keras.layers.Dense(1)

# create model
model = tf.keras.Sequential([
  base_model,
  global_average_layer,
  prediction_layer
])

# compile
base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=base_learning_rate),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=['accuracy'])

# see summary
model.summary()
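Because the base is frozen, the only weights that should be updated during training are the kernel and the bias of the final Dense layer. A quick check (my own addition, mirroring the official tutorial):

# only the classification head is trainable: the Dense layer's kernel and bias
print(len(model.trainable_variables))  # expected: 2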

# train model
initial_epochs = 10

history = model.fit(train_batches,
                    epochs=initial_epochs,
                    validation_data=validation_batches)

Check the loss and the accuracy.

acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

plt.figure(figsize=(8, 8))
plt.subplot(2, 1, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.ylabel('Accuracy')
plt.ylim([min(plt.ylim()),1])
plt.title('Training and Validation Accuracy')

plt.subplot(2, 1, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.ylabel('Cross Entropy')
plt.ylim([0,1.0])
plt.title('Training and Validation Loss')
plt.xlabel('epoch')
plt.show()
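We built test_batches earlier but never used it above. To close the loop, evaluating the trained model on the held-out 10% would look like the sketch below (my addition, not in the notebook).

# evaluate on the held-out test split
test_loss, test_accuracy = model.evaluate(test_batches)
print('Test loss: {:.4f} - Test accuracy: {:.4f}'.format(test_loss, test_accuracy))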


Do you want to connect? It will be a pleasure to discuss Machine Learning with you. Drop me a message on LinkedIn.
