Please work in Google Colab.
1. Load the MNIST dataset and create a CNN model
- load the MNIST dataset from the tensorflow/keras built-in dataset (just like last time)
- use the original train/test split!
- divide each pixel's value by 255, and this time do not reshape: leave each image as a 2D (28×28) matrix
- e.g. for the test set you will have a (10000, 28, 28) shaped array
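A minimal loading/preprocessing sketch along these lines (variable names are my own; the channel-axis `expand_dims` step and the one-hot labels are required by later points of this exercise):

```python
import numpy as np
from tensorflow import keras

# Original train/test split from the Keras built-in dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

# Scale pixels to [0, 1]; each image stays a 28x28 matrix
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Conv2D layers expect a channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = np.expand_dims(x_train, -1)
x_test = np.expand_dims(x_test, -1)

# One-hot labels for the categorical crossentropy loss
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

print(x_test.shape)  # (10000, 28, 28, 1)
```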
train the following network on the training set and generate predictions for the 10,000 test images:
input (28, 28)
conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
maxpooling, pool size = 2×2
conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
maxpooling, pool size = 2×2
flatten
dense, 10 neurons, softmax activation
- pay attention to channel format, you will need to expand dims!
- how many parameters do we have for each layer?
- use Adam optimizer with default parameters
- use categorical crossentropy as loss function
- compile the model
- print out a summary of the model
- train the CNN on the training data for 5 epochs with batch size of 32
- use the test data as validation data
- calculate the categorical cross-entropy loss and the accuracy! Hint: you should get at least ~98% accuracy
- show the confusion matrix of the predictions (predicted values vs actual labels)
- where does the model make mistakes? Where does it improve compared to fully connected nets?
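One way the architecture above can be written in Keras (a sketch, not the only valid layout). The per-layer parameter counts can be checked by hand: a Conv2D layer has (kh·kw·in_channels + 1)·filters weights, and a Dense layer has (inputs + 1)·units, as noted in the comments:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, padding="valid", activation="relu"),  # (3*3*1+1)*16  = 160
    layers.Conv2D(16, 3, padding="valid", activation="relu"),  # (3*3*16+1)*16 = 2320
    layers.MaxPooling2D(pool_size=2),
    layers.Conv2D(32, 3, padding="valid", activation="relu"),  # (3*3*16+1)*32 = 4640
    layers.Conv2D(32, 3, padding="valid", activation="relu"),  # (3*3*32+1)*32 = 9248
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),                                          # 4*4*32 = 512 features
    layers.Dense(10, activation="softmax"),                    # (512+1)*10 = 5130
])

model.compile(optimizer="adam",                  # default Adam parameters
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
print(model.count_params())  # 21498
```

Training would then be `model.fit(x_train, y_train, epochs=5, batch_size=32, validation_data=(x_test, y_test))`, with the preprocessed arrays from the loading step (names assumed here).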
2. Download the Street View House Numbers (SVHN) Dataset
- source: http://ufldl.stanford.edu/housenumbers/
- use the cropped dataset!
- to get the dataset use e.g. wget, and keep the original split, so download the train and test matrix files
- preprocess the downloaded data so it can be used for training and testing: the shapes should be the same as in exercise 1 (except the image sizes)
- how many classes do we have in the dataset? how many train and test examples do we have?
- what is the dimension of the images?
- show 5 images from the dataset
- make one-hot encoding for the labels
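The cropped SVHN .mat files store images sample-last, as (32, 32, 3, N), and label the digit 0 as class 10, so the preprocessing differs slightly from MNIST. A sketch (function name is mine; assumes the files were fetched with wget from the URL above and read with `scipy.io.loadmat`):

```python
import numpy as np

def preprocess_svhn(mat):
    """Turn a loaded SVHN cropped-digit .mat dict into (N, 32, 32, 3)
    floats in [0, 1] plus one-hot labels."""
    x = np.moveaxis(mat["X"], -1, 0).astype("float32") / 255.0  # (32,32,3,N) -> (N,32,32,3)
    y = mat["y"].ravel() % 10                                   # SVHN labels the digit 0 as 10
    return x, np.eye(10, dtype="float32")[y]

# Usage, after e.g.
#   !wget http://ufldl.stanford.edu/housenumbers/train_32x32.mat
# from scipy.io import loadmat
# x_train, y_train = preprocess_svhn(loadmat("train_32x32.mat"))
```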
3. Train the CNN model seen in the 1st exercise on this dataset
- create a convolutional neural network
the network should have the following layers:
input (32, 32, 3)
conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
conv2D, 16 kernels, kernel size = 3, valid padding, relu activation
maxpooling, pool size = 2×2
conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
conv2D, 32 kernels, kernel size = 3, valid padding, relu activation
maxpooling, pool size = 2×2
flatten
dense, 10 neurons, softmax activation
- how many parameters do we have for each layer?
- use Adam optimizer with default parameters
- use categorical crossentropy as loss function
- compile the model
- print out a summary of the model
- train the CNN on the training data for 15 epochs with batch size of 32
- use the test data as validation data
- calculate the categorical cross-entropy loss and the accuracy! Hint: you should get at least ~80-90% accuracy
- plot the training and the validation loss on the same plot!
- plot the training and the validation accuracy on the same plot!
- do we overfit?
- show the confusion matrix of the predictions (predicted values vs actual labels)
- where does the model make mistakes?
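The two requested curve plots can be produced from the `History` object that `model.fit` returns; a sketch (helper name is mine):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed in Colab
import matplotlib.pyplot as plt

def plot_curves(hist, outfile="curves.png"):
    """Plot training vs. validation loss and accuracy
    from a Keras History.history dict."""
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))
    ax_loss.plot(hist["loss"], label="train loss")
    ax_loss.plot(hist["val_loss"], label="validation loss")
    ax_loss.set_xlabel("epoch")
    ax_loss.legend()
    ax_acc.plot(hist["accuracy"], label="train accuracy")
    ax_acc.plot(hist["val_accuracy"], label="validation accuracy")
    ax_acc.set_xlabel("epoch")
    ax_acc.legend()
    fig.savefig(outfile)

# Usage: history = model.fit(...); plot_curves(history.history)
```

A widening gap between the two loss curves (validation loss rising while training loss keeps falling) is the usual sign of overfitting.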
4. Train another CNN
- as we can see, the previous architecture can be further improved
- come up with an architecture that can achieve more than 91% accuracy on the test set
- print out the summary for this model!
- plot the loss and accuracy curves for this model too!
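One direction that often helps on SVHN is adding a third convolutional block plus batch normalisation and dropout; a sketch of such a model (the filter counts and dropout rate are assumptions to experiment with, not a verified recipe for 91%):

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    model = keras.Sequential([keras.Input(shape=(32, 32, 3))])
    # Three conv blocks with increasing width, each halving the spatial size
    for filters in (32, 64, 128):
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.BatchNormalization())
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        model.add(layers.MaxPooling2D(pool_size=2))
        model.add(layers.Dropout(0.3))  # regularisation against overfitting
    model.add(layers.Flatten())
    model.add(layers.Dense(10, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_model()
model.summary()
```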