1. Load the MNIST dataset and create a CNN model

  • load the MNIST dataset from the tensorflow/keras built-in dataset (just like last time)
  • use the original train/test split!
  • divide each pixel's value by 255; this time do not flatten the images, keep each one as a 2D matrix (28x28)
  • e.g. for the test set you will have an array of shape (10000, 28, 28)
  • train the following network on the training set and generate predictions for the 10,000 test images:

      input (28, 28)
      conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
      conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
      maxpooling kernel size = 2*2
      conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
      conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
      maxpooling kernel size = 2*2
      dense, 10 neurons, softmax activation
    • pay attention to channel format, you will need to expand dims!
    • how many parameters do we have for each layer?
    • use Adam optimizer with default parameters
    • use categorical crossentropy as loss function
    • compile the model
    • print out a summary of the model
    • train the CNN on the training data for 5 epochs with batch size of 32
    • use the test data as validation data
  • calculate the categorical cross-entropy loss and the accuracy! Hint: you should get at least ~98% accuracy

  • show the confusion matrix of the predictions (predicted values vs actual labels)
  • where does the model make mistakes? Where does it improve compared to fully connected nets?

2. Download the Street View House Numbers (SVHN) Dataset

  • source: http://ufldl.stanford.edu/housenumbers/
  • use the cropped dataset!
  • to get the dataset use e.g. wget and keep the original split, i.e. download the train and test .mat files
  • preprocess the downloaded data so that it can be used for training and testing; the array shapes should follow the same layout as in exercise 1 (except for the image sizes)
  • how many classes do we have in the dataset? how many train and test examples do we have?
  • what is the dimension of the images?
  • show 5 images from the dataset
  • make one-hot encoding for the labels

3. Train the CNN model seen in the 1st exercise for this dataset

  • create a convolutional neural network
  • the network should have the following layers:

      input (32, 32, 3)
      conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
      conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
      maxpooling kernel size = 2*2
      conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
      conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
      maxpooling kernel size = 2*2
      dense, 10 neurons, softmax activation
    • how many parameters do we have for each layer?
    • use Adam optimizer with default parameters
    • use categorical crossentropy as loss function
    • compile the model
    • print out a summary of the model
    • train the CNN on the training data for 15 epochs with batch size of 32
    • use the test data as validation data
  • calculate the categorical cross-entropy loss and the accuracy! Hint: you should get at least ~80-90% accuracy

4. Evaluate performance

  • plot the training and the validation loss on the same plot!
  • plot the training and the validation accuracy on the same plot!
  • do we overfit?
  • show the confusion matrix of the predictions (predicted values vs actual labels)
  • where does the model make mistakes?

5. Train another CNN

  • as we can see the previous architecture can be further improved
  • come up with an architecture that can achieve more than 91% accuracy on the test set
  • print out the summary for this model!
  • plot the loss and accuracy curves for this model too!
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.layers import Dense, Conv2D, MaxPooling2D, \
                                    BatchNormalization, Dropout, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.datasets import mnist
from tensorflow.keras import utils
from tensorflow.keras import callbacks
from sklearn.metrics import confusion_matrix
import seaborn as sns
from scipy.io import loadmat


(x_train, y_train), (x_test, y_test) = mnist.load_data()

# add a trailing channel axis, then normalize inputs from 0-255 to 0.0-1.0
x_train = np.expand_dims( x_train.astype('float32'), -1)
x_test = np.expand_dims( x_test.astype('float32'), -1 )
x_train = x_train / 255.0
x_test = x_test / 255.0

# one hot encode outputs
y_train = utils.to_categorical(y_train)
y_test = utils.to_categorical(y_test)
num_classes = y_test.shape[1]

x_train.shape, x_test.shape, y_train.shape, y_test.shape, num_classes
[0 1 2 3 4 5 6 7 8 9]
((60000, 28, 28, 1), (10000, 28, 28, 1), (60000, 10), (10000, 10), 10)
plt.imshow(x_train[42, ..., 0])
model = Sequential()
model.add( Conv2D(16, (3, 3), input_shape=(28, 28, 1), padding='valid', activation='relu' ) )
model.add( Conv2D(16, (3, 3), activation='relu', padding='valid' ) )
model.add( MaxPooling2D((2, 2)) )
model.add( Conv2D(32, (3, 3), activation='relu', padding='valid' ) )
model.add( Conv2D(32, (3, 3), activation='relu', padding='valid' ) )
model.add( MaxPooling2D((2, 2)) )
model.add (Flatten() )
model.add( Dense(num_classes, activation='softmax') )

# Compile model and print its summary
epochs = 5
model.compile( loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'] )
model.summary()
Model: "sequential"
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 26, 26, 16)        160       
conv2d_1 (Conv2D)            (None, 24, 24, 16)        2320      
max_pooling2d (MaxPooling2D) (None, 12, 12, 16)        0         
conv2d_2 (Conv2D)            (None, 10, 10, 32)        4640      
conv2d_3 (Conv2D)            (None, 8, 8, 32)          9248      
max_pooling2d_1 (MaxPooling2 (None, 4, 4, 32)          0         
flatten (Flatten)            (None, 512)               0         
dense (Dense)                (None, 10)                5130      
Total params: 21,498
Trainable params: 21,498
Non-trainable params: 0
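The parameter counts in the summary can be checked by hand. This is a minimal sketch, assuming the usual formulas for layers with a bias term: (k*k*c_in + 1) * filters for Conv2D and (n_in + 1) * n_out for Dense. The helper names conv2d_params and dense_params are made up for this illustration and are not part of Keras.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters


def dense_params(n_inputs, n_outputs):
    """Weights plus one bias per output neuron for a Dense layer."""
    return (n_inputs + 1) * n_outputs


# Layers of the MNIST network, in order (pooling and flatten have no parameters)
counts = [
    conv2d_params(3, 1, 16),       # conv2d
    conv2d_params(3, 16, 16),      # conv2d_1
    conv2d_params(3, 16, 32),      # conv2d_2
    conv2d_params(3, 32, 32),      # conv2d_3
    dense_params(4 * 4 * 32, 10),  # dense on the flattened 4x4x32 feature map
]
print(counts, sum(counts))
```

The same helpers reproduce the SVHN network's counts when the first layer is given 3 input channels and the dense layer 5*5*32 inputs.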
# Fit the model
history = model.fit( x_train, y_train, validation_data=(x_test, y_test), epochs=epochs, batch_size=32 )
Epoch 1/5
1875/1875 [==============================] - 11s 6ms/step - loss: 0.1609 - accuracy: 0.9506 - val_loss: 0.0493 - val_accuracy: 0.9852
Epoch 2/5
1875/1875 [==============================] - 10s 6ms/step - loss: 0.0515 - accuracy: 0.9847 - val_loss: 0.0381 - val_accuracy: 0.9879
Epoch 3/5
1875/1875 [==============================] - 10s 6ms/step - loss: 0.0372 - accuracy: 0.9881 - val_loss: 0.0319 - val_accuracy: 0.9891
Epoch 4/5
1875/1875 [==============================] - 10s 6ms/step - loss: 0.0287 - accuracy: 0.9912 - val_loss: 0.0383 - val_accuracy: 0.9874
Epoch 5/5
1875/1875 [==============================] - 10s 6ms/step - loss: 0.0226 - accuracy: 0.9926 - val_loss: 0.0257 - val_accuracy: 0.9917
# summarize history for loss
plt.figure( figsize=(6,6), dpi=100 )
plt.plot( np.arange(epochs)+1, history.history['loss'] )
plt.plot( np.arange(epochs)+1, history.history['val_loss'] )
plt.title('Model loss function')
plt.legend(['train', 'test'], loc='upper right')
y_pred = model.predict( x_test, verbose=0)
# Final evaluation of the model
scores = model.evaluate( x_test, y_test, verbose=0 )
print("Accuracy: %.2f%%" % ( scores[1]*100) )
Accuracy: 99.17%
y_test_labels = np.argmax(y_test, axis=1)
y_pred_labels = np.argmax(y_pred, axis=1)
y_test_labels.shape, y_pred_labels.shape
((10000,), (10000,))
matrix = confusion_matrix(y_test_labels, y_pred_labels, labels=np.arange(0, 10))

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(matrix, annot=True, cmap='Greens', fmt='d', ax=ax)
plt.title('Confusion Matrix for test dataset')
plt.xlabel('Predicted label')
plt.ylabel('True label')
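To answer where the model makes mistakes, it can help to list the largest off-diagonal cells of the confusion matrix instead of reading them off the heatmap. This is a small sketch in plain Python; the function name top_confusions and the toy matrix are made up for illustration.

```python
def top_confusions(matrix, k=3):
    """Return the k largest off-diagonal cells as (true, predicted, count)."""
    cells = [
        (true, pred, count)
        for true, row in enumerate(matrix)
        for pred, count in enumerate(row)
        if true != pred and count > 0
    ]
    return sorted(cells, key=lambda cell: cell[2], reverse=True)[:k]


# Toy 3-class matrix, made up for illustration (rows: true label, columns: predicted)
toy = [
    [50, 2, 0],
    [1, 45, 7],
    [0, 3, 40],
]
print(top_confusions(toy, k=2))
```

For the real result, passing matrix.tolist() from the confusion_matrix call should give the most frequently confused digit pairs.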


# Load the data

train_raw = loadmat('train_32x32.mat')
test_raw = loadmat('test_32x32.mat')

# Load images and labels

x_train = np.array(train_raw['X'])
x_test = np.array(test_raw['X'])

y_train = train_raw['y']
y_test = test_raw['y']

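# Check the shape of the data

print(x_train.shape)
print(x_test.shape)

# Fix the axes of the images: move the sample axis from last to first

x_train = np.moveaxis(x_train, -1, 0)
x_test = np.moveaxis(x_test, -1, 0)

print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)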
(32, 32, 3, 73257)
(32, 32, 3, 26032)
(73257, 32, 32, 3)
(26032, 32, 32, 3)
(73257, 1)
(26032, 1)
print(np.unique(y_train))
print(np.unique(y_test))
[ 1  2  3  4  5  6  7  8  9 10]
[ 1  2  3  4  5  6  7  8  9 10]
# normalize inputs from 0-255 to 0.0-1.0
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train = x_train / 255.0
x_test = x_test / 255.0

# one hot encode outputs
y_train[ y_train == 10] = 0 # 10 is 0 here
y_test[ y_test == 10] = 0 # 10 is 0 here
y_train = utils.to_categorical(y_train)
y_test = utils.to_categorical(y_test)
num_classes = y_test.shape[1]

x_train.shape, x_test.shape, y_train.shape, y_test.shape, num_classes
((73257, 32, 32, 3), (26032, 32, 32, 3), (73257, 10), (26032, 10), 10)
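The preprocessing steps (moving the sample axis to the front, scaling, mapping label 10 to 0, one-hot encoding) can be tried on a tiny synthetic array before running them on the full dataset. This is a sketch under assumptions: the array contents are invented, and the one-hot step uses an identity-matrix lookup in NumPy, which should match utils.to_categorical(labels, num_classes=10).

```python
import numpy as np

# Synthetic stand-in for the .mat contents: 4 images stored as (H, W, C, N)
x = np.zeros((32, 32, 3, 4), dtype=np.uint8)
y = np.array([[1], [10], [3], [10]], dtype=np.uint8)

x = np.moveaxis(x, -1, 0).astype('float32') / 255.0  # -> (N, H, W, C)
labels = y.ravel() % 10                               # 10 -> 0, others unchanged
one_hot = np.eye(10, dtype='float32')[labels]         # one identity row per label

print(x.shape, labels.tolist(), one_hot.shape)
```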
# Plot an image and its label

plt.imshow(x_train[73239])
print('Label OH vector:', y_train[73239], '\nLabel number:', np.argmax(y_train[73239]))
Label OH vector: [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.] 
Label number: 0


model = Sequential()
model.add( Conv2D(16, (3, 3), input_shape=(32, 32, 3), padding='valid', activation='relu' ) )
model.add( Conv2D(16, (3, 3), activation='relu', padding='valid' ) )
model.add( MaxPooling2D((2, 2)) )
model.add( Conv2D(32, (3, 3), activation='relu', padding='valid' ) )
model.add( Conv2D(32, (3, 3), activation='relu', padding='valid' ) )
model.add( MaxPooling2D((2, 2)) )
model.add (Flatten() )
model.add( Dense(num_classes, activation='softmax') )

# Compile model and print its summary
epochs = 15
model.compile( loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'] )
model.summary()
Model: "sequential_5"
Layer (type)                 Output Shape              Param #   
conv2d_26 (Conv2D)           (None, 30, 30, 16)        448       
conv2d_27 (Conv2D)           (None, 28, 28, 16)        2320      
max_pooling2d_13 (MaxPooling (None, 14, 14, 16)        0         
conv2d_28 (Conv2D)           (None, 12, 12, 32)        4640      
conv2d_29 (Conv2D)           (None, 10, 10, 32)        9248      
max_pooling2d_14 (MaxPooling (None, 5, 5, 32)          0         
flatten_5 (Flatten)          (None, 800)               0         
dense_8 (Dense)              (None, 10)                8010      
Total params: 24,666
Trainable params: 24,666
Non-trainable params: 0
# Fit the model
history = model.fit( x_train, y_train, validation_data=(x_test, y_test), 
                     epochs=epochs, batch_size=32 )
Epoch 1/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.8316 - accuracy: 0.7434 - val_loss: 0.5799 - val_accuracy: 0.8350
Epoch 2/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.4605 - accuracy: 0.8663 - val_loss: 0.4786 - val_accuracy: 0.8622
Epoch 3/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.3944 - accuracy: 0.8860 - val_loss: 0.4363 - val_accuracy: 0.8734
Epoch 4/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.3570 - accuracy: 0.8952 - val_loss: 0.4343 - val_accuracy: 0.8750
Epoch 5/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.3318 - accuracy: 0.9026 - val_loss: 0.3996 - val_accuracy: 0.8831
Epoch 6/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.3140 - accuracy: 0.9072 - val_loss: 0.3803 - val_accuracy: 0.8898
Epoch 7/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.3007 - accuracy: 0.9115 - val_loss: 0.4058 - val_accuracy: 0.8834
Epoch 8/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2863 - accuracy: 0.9160 - val_loss: 0.3690 - val_accuracy: 0.8934
Epoch 9/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2745 - accuracy: 0.9180 - val_loss: 0.3699 - val_accuracy: 0.8948
Epoch 10/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2644 - accuracy: 0.9219 - val_loss: 0.3875 - val_accuracy: 0.8903
Epoch 11/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2564 - accuracy: 0.9242 - val_loss: 0.3664 - val_accuracy: 0.8967
Epoch 12/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2483 - accuracy: 0.9268 - val_loss: 0.3779 - val_accuracy: 0.8930
Epoch 13/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2427 - accuracy: 0.9287 - val_loss: 0.3803 - val_accuracy: 0.8931
Epoch 14/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2362 - accuracy: 0.9292 - val_loss: 0.3844 - val_accuracy: 0.8955
Epoch 15/15
2290/2290 [==============================] - 17s 7ms/step - loss: 0.2294 - accuracy: 0.9320 - val_loss: 0.3766 - val_accuracy: 0.8981


# summarize history for loss
plt.figure( figsize=(6,6), dpi=100 )
plt.plot( np.arange(epochs)+1, history.history['loss'] )
plt.plot( np.arange(epochs)+1, history.history['val_loss'] )
plt.title('Model loss function')
plt.legend(['train', 'test'], loc='upper right')
# summarize history for accuracy
plt.figure( figsize=(6,6), dpi=100 )
plt.plot( np.arange(epochs)+1, history.history['accuracy'] )
plt.plot( np.arange(epochs)+1, history.history['val_accuracy'] )
plt.title('Model accuracy')
plt.legend(['train', 'test'], loc='upper left')
y_pred = model.predict( x_test, verbose=0)
# Final evaluation of the model
scores = model.evaluate( x_test, y_test, verbose=0 )
print("Accuracy: %.2f%%" % ( scores[1]*100) )
Accuracy: 89.81%
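For the "do we overfit?" question, the per-epoch losses from the SVHN training log can be compared directly. This is a rough sketch in plain Python using the logged values. The training loss keeps decreasing while the validation loss stops improving after its minimum, which suggests mild overfitting rather than proving it.

```python
# Per-epoch losses copied from the 15-epoch SVHN training log
train_loss = [0.8316, 0.4605, 0.3944, 0.3570, 0.3318, 0.3140, 0.3007, 0.2863,
              0.2745, 0.2644, 0.2564, 0.2483, 0.2427, 0.2362, 0.2294]
val_loss = [0.5799, 0.4786, 0.4363, 0.4343, 0.3996, 0.3803, 0.4058, 0.3690,
            0.3699, 0.3875, 0.3664, 0.3779, 0.3803, 0.3844, 0.3766]

# Epoch (1-based) with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
# Gap between validation and training loss after the last epoch
final_gap = val_loss[-1] - train_loss[-1]

print(f"lowest validation loss at epoch {best_epoch}")
print(f"final validation-minus-training loss gap: {final_gap:.4f}")
```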
y_test_labels = np.argmax(y_test, axis=1)
y_pred_labels = np.argmax(y_pred, axis=1)
y_test_labels.shape, y_pred_labels.shape
((26032,), (26032,))
matrix = confusion_matrix(y_test_labels, y_pred_labels, labels=np.arange(0, 10))

fig, ax = plt.subplots(figsize=(8, 6))
sns.heatmap(matrix, annot=True, cmap='Greens', fmt='d', ax=ax)
plt.title('Confusion Matrix for test dataset')
plt.xlabel('Predicted label')
plt.ylabel('True label')


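# NOTE: this cell was truncated in the export. The layer order follows the
# printed summary; the 'relu' activations and dropout rates are assumptions.
model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu',
           input_shape=(32, 32, 3)),
    BatchNormalization(),
    Conv2D(32, (3, 3), padding='same', activation='relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.3),

    Conv2D(64, (3, 3), padding='same', activation='relu'),
    BatchNormalization(),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.3),

    Conv2D(128, (3, 3), padding='same', activation='relu'),
    BatchNormalization(),
    Conv2D(128, (3, 3), padding='same', activation='relu'),
    MaxPooling2D((2, 2)),
    Dropout(0.3),

    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.4),
    Dense(10,  activation='softmax')
])

early_stopping = callbacks.EarlyStopping(patience=8)
optimizer = Adam(learning_rate=1e-3, amsgrad=True)
# The checkpoint arguments were lost in the export; the file name is a placeholder.
model_checkpoint = callbacks.ModelCheckpoint('best_cnn.h5', save_best_only=True)
model.compile( loss='categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'] )
model.summary()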
Model: "sequential_4"
Layer (type)                 Output Shape              Param #   
conv2d_20 (Conv2D)           (None, 32, 32, 32)        896       
batch_normalization_6 (Batch (None, 32, 32, 32)        128       
conv2d_21 (Conv2D)           (None, 32, 32, 32)        9248      
max_pooling2d_10 (MaxPooling (None, 16, 16, 32)        0         
dropout_8 (Dropout)          (None, 16, 16, 32)        0         
conv2d_22 (Conv2D)           (None, 16, 16, 64)        18496     
batch_normalization_7 (Batch (None, 16, 16, 64)        256       
conv2d_23 (Conv2D)           (None, 16, 16, 64)        36928     
max_pooling2d_11 (MaxPooling (None, 8, 8, 64)          0         
dropout_9 (Dropout)          (None, 8, 8, 64)          0         
conv2d_24 (Conv2D)           (None, 8, 8, 128)         73856     
batch_normalization_8 (Batch (None, 8, 8, 128)         512       
conv2d_25 (Conv2D)           (None, 8, 8, 128)         147584    
max_pooling2d_12 (MaxPooling (None, 4, 4, 128)         0         
dropout_10 (Dropout)         (None, 4, 4, 128)         0         
flatten_4 (Flatten)          (None, 2048)              0         
dense_6 (Dense)              (None, 128)               262272    
dropout_11 (Dropout)         (None, 128)               0         
dense_7 (Dense)              (None, 10)                1290      
Total params: 551,466
Trainable params: 551,018
Non-trainable params: 448
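The 448 non-trainable parameters come from the three BatchNormalization layers. This is a short sketch of the arithmetic, assuming the standard layout of four values per channel: gamma and beta are trained, while the moving mean and moving variance are only updated as running statistics. The helper name batchnorm_params is made up for this illustration.

```python
def batchnorm_params(channels):
    """Return (trainable, non_trainable) parameter counts for BatchNormalization."""
    trainable = 2 * channels       # gamma (scale) and beta (offset)
    non_trainable = 2 * channels   # moving mean and moving variance
    return trainable, non_trainable


per_layer = [batchnorm_params(c) for c in (32, 64, 128)]
totals = [sum(pair) for pair in per_layer]            # Param # column of the summary
non_trainable_total = sum(nt for _, nt in per_layer)
print(totals, non_trainable_total)
```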
# Fit the model
history = model.fit( x_train, y_train, validation_data=(x_test, y_test), 
                     epochs=15, batch_size=32, 
                     callbacks=[early_stopping, model_checkpoint])
Epoch 1/15
2290/2290 [==============================] - 46s 20ms/step - loss: 1.9731 - accuracy: 0.2804 - val_loss: 1.4497 - val_accuracy: 0.5539
Epoch 2/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.9340 - accuracy: 0.6889 - val_loss: 0.4537 - val_accuracy: 0.8665
Epoch 3/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.4595 - accuracy: 0.8681 - val_loss: 0.3838 - val_accuracy: 0.8914
Epoch 4/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.3732 - accuracy: 0.8930 - val_loss: 0.2612 - val_accuracy: 0.9300
Epoch 5/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.3272 - accuracy: 0.9075 - val_loss: 0.3148 - val_accuracy: 0.9122
Epoch 6/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2977 - accuracy: 0.9168 - val_loss: 0.3140 - val_accuracy: 0.9125
Epoch 7/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2781 - accuracy: 0.9209 - val_loss: 0.2378 - val_accuracy: 0.9388
Epoch 8/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2580 - accuracy: 0.9268 - val_loss: 0.2243 - val_accuracy: 0.9424
Epoch 9/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2441 - accuracy: 0.9309 - val_loss: 0.2191 - val_accuracy: 0.9440
Epoch 10/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2314 - accuracy: 0.9341 - val_loss: 0.2066 - val_accuracy: 0.9496
Epoch 11/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2263 - accuracy: 0.9358 - val_loss: 0.2150 - val_accuracy: 0.9435
Epoch 12/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2153 - accuracy: 0.9387 - val_loss: 0.2115 - val_accuracy: 0.9469
Epoch 13/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.2087 - accuracy: 0.9401 - val_loss: 0.1928 - val_accuracy: 0.9524
Epoch 14/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.1978 - accuracy: 0.9437 - val_loss: 0.2240 - val_accuracy: 0.9473
Epoch 15/15
2290/2290 [==============================] - 46s 20ms/step - loss: 0.1918 - accuracy: 0.9445 - val_loss: 0.2110 - val_accuracy: 0.9506
# Evaluate train and validation accuracies and losses

train_acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

train_loss = history.history['loss']
val_loss = history.history['val_loss']
# Visualize epochs vs. train and validation accuracies and losses

plt.figure(figsize=(20, 10))

plt.subplot(1, 2, 1)
plt.plot(train_acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Epochs vs. Training and Validation Accuracy')
plt.legend()
plt.subplot(1, 2, 2)
plt.plot(train_loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Epochs vs. Training and Validation Loss')
plt.legend()
# Evaluate model on test data
test_loss, test_acc = model.evaluate(x=x_test, y=y_test, verbose=0)

print('Test accuracy is: {:0.4f} \nTest loss is: {:0.4f}'.
      format(test_acc, test_loss))
Test accuracy is: 0.9506 
Test loss is: 0.2110