In this notebook, we will build on our Face Recognition with SVM notebook and replicate the work done there using Google's TensorFlow 2.0 library. We will create a Convolutional Neural Network (CNN) model for face recognition, train it on the same data we used earlier, and test it against the test set.

*If you don't have decent hardware, you can run this notebook inside Google Colab.*

When running in Colab, we need to switch to TensorFlow 2.0+. This can be done easily using the magic command `%tensorflow_version 2.x`.

If you run this code on your local machine, you can skip or remove the following cell.

In [1]:

```
import sys
IN_COLAB = 'google.colab' in sys.modules
if IN_COLAB:
    %tensorflow_version 2.x
import tensorflow as tf
print(f"Running in Colab with TensorFlow version: {tf.__version__}")
```

Let's import the face dataset we previously used, via scikit-learn. This is exactly the same as before; nothing has changed here.

In [ ]:

```
from sklearn import datasets
data = datasets.fetch_olivetti_faces()
```

Then, we need to prepare our data for the deep learning model.

Colored images are, most of the time, represented with 3 different matrices, each holding the information for one colour/channel. These colours are **Red**, **Green**, and **Blue**, or `RGB` for short. Therefore, an image of size `128 x 128` can be represented as a `128x128x3` or `3x128x128` tensor; the number of channels/colours can appear as either the first or the last dimension.

Since all the face images are grayscale, we only have one channel, which holds the brightness of each pixel. The closer a value is to 1, the brighter the pixel; the closer to 0, the darker the pixel.
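As a quick sketch of this channel axis (using a hypothetical batch of random images rather than the real dataset), we can see how `np.expand_dims` turns a stack of grayscale images into the shape Keras convolution layers expect:

```python
import numpy as np

# A batch of 5 hypothetical grayscale images, 64 x 64 pixels each.
# Pixel values lie in [0, 1]: close to 1 is bright, close to 0 is dark.
images = np.random.rand(5, 64, 64)
print(images.shape)   # (5, 64, 64) - no channel axis yet

# Keras convolution layers expect an explicit channel axis,
# so we append one of size 1 (a single grayscale channel).
images = np.expand_dims(images, -1)
print(images.shape)   # (5, 64, 64, 1)
```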

In [ ]:

```
import tensorflow as tf
import numpy as np
# rename dataset for easy access
X = data["images"]
y = data["target"]
num_class = len(set(y))   # number of different people in the dataset
X = np.expand_dims(X, -1) # add an axis for the channel information
```

We can identify the people in this dataset with the `y` variable. `y` is an ordinally encoded variable in which each person is represented by a unique number starting from 0.

For this deep learning model, we need to convert the `y` variable into a vector that is unique for each person. This vector is made of zeros and a single `1` value marking the class/person. This is called **One-hot Encoding**.

Suppose that we have 3 people in this dataset and we represent them as `1`, `2`, and `3`. When converting these into **one-hot encoded vectors**, 1 becomes `[1 0 0]`, 2 becomes `[0 1 0]`, and 3 becomes `[0 0 1]`: the position corresponding to the class is set to 1, while the rest of the vector is filled with zeros.
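Before reaching for TensorFlow, the idea can be sketched in plain numpy: indexing into an identity matrix yields exactly these one-hot rows (labels here start from 0, as in our dataset):

```python
import numpy as np

labels = np.array([0, 1, 2, 1])   # four samples over three classes, encoded 0..2
one_hot = np.eye(3)[labels]       # row i of the identity matrix is the one-hot vector for class i
print(one_hot)
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```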

How can we convert our `y` variable to one-hot encoded vectors? Well, we can use TensorFlow's `one_hot` function as follows:

In [ ]:

```
y = tf.one_hot(y, depth=num_class).numpy() # convert y to one hot vectors
```

In order to have reproducible results, we can fix the seed values of the random number generators of both the numpy and tensorflow libraries.

In [ ]:

```
np.random.seed(1)
tf.random.set_seed(2)
```

Then we can split our data and labels for training and testing. This is exactly the same procedure as the previous SVM notebook.

In [ ]:

```
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit
# split data randomly into train & test sets by preserving train/test ratio across classes
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.1, random_state=0)
# get the train and test indexes
train_index, test_index = next(sss.split(X, y))
# split X and y into train & test sets
X_train, X_test = X[train_index], X[test_index]
y_train, y_test = y[train_index], y[test_index]
```
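To see what "preserving the train/test ratio across classes" means in practice, here is a small sketch on toy data with two perfectly balanced classes (names and sizes are made up for illustration):

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Toy labels: 8 samples of class 0 and 8 of class 1.
y_toy = np.array([0] * 8 + [1] * 8)
X_toy = np.arange(16).reshape(-1, 1)

sss = StratifiedShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(sss.split(X_toy, y_toy))

# The split keeps the 50/50 class balance: 4 test samples, 2 per class.
print(np.bincount(y_toy[test_idx]))   # [2 2]
```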

While training our deep learning model, we would like to see how it performs as training progresses. Since the training set is used to fit the model and the test set is reserved for measuring how the final model performs, we need an additional set that is part of neither. That dataset is called the **validation set**. We can follow the same procedure as above to split the training set into training and validation sets.

In [7]:

```
# split the training set into training and validation sets
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.15, random_state=0)
train_index, val_index = next(sss.split(X_train, y_train))
# note: these indices refer to X_train/y_train, not the full X/y
X_train, X_val = X_train[train_index], X_train[val_index]
y_train, y_val = y_train[train_index], y_train[val_index]
# Print statistics about the splits
print(f"Train data size: {len(y_train)}")
print(f"Validation data size: {len(y_val)}")
print(f"Test data size : {len(y_test)}")
```

Now that our data is ready, let's create a deep learning model. To define the model, we'll use the **Keras** API that ships with TensorFlow, as it provides an easy-to-use interface for defining deep learning models.

We'll use several deep learning layers to define a convolutional neural network. Let's peek at Keras' documentation to find out what these layers do.

**Dense:** A densely-connected NN layer. All the neurons/elements are connected to all the neurons/elements in the previous and next layers.

**Dropout:** Randomly drops connections in order to prevent memorizing/over-fitting the data.

**Conv2D:** Learns spatially-correlated features. Low-level features can be edges, corners, etc.; high-level features can be the eyes, mouth, or nose of a human.

*Illustration: Narges Khatami, Wikipedia*

**MaxPooling2D:** Combines several neurons into one and reduces the dimension. There are several types of pooling layers: max pooling selects the neuron with the maximum value, average pooling calculates the average value of all neurons, and min pooling selects the neuron with the minimum value. You can find a sample max pooling operation below:

*Illustration: Aphex34, Wikipedia*

**Flatten:** Gathers all the matrix/tensor elements into a vector.
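The 2 x 2 max pooling operation described above can be sketched in plain numpy on a small made-up feature map: each non-overlapping 2 x 2 window is replaced by its maximum value, halving each spatial dimension.

```python
import numpy as np

# A hypothetical 4 x 4 feature map.
fmap = np.array([[1, 3, 2, 1],
                 [4, 2, 0, 1],
                 [5, 1, 9, 2],
                 [0, 3, 2, 8]])

# Reshape into 2 x 2 blocks and take the max of each block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[4 2]
#  [5 9]]
```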

In [ ]:

```
from tensorflow.keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten
from tensorflow.keras import Sequential
model = Sequential()
model.add(Conv2D(16, (2, 2), activation='relu', input_shape=X_train[0].shape))
model.add(Conv2D(16, (2, 2), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(32, (2, 2), activation='relu'))
model.add(Conv2D(16, (2, 2), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
```

We have the model, and now we need an output from it. For the output, we can use a *Dense* layer with a `softmax` activation. The length of the output vector will be the same as that of the one-hot vectors. Then, to get an outline of what the model looks like, we can use `model.summary()` to print the overall structure of the CNN model.

In [9]:

```
model.add(Dense(num_class, activation='softmax'))
model.summary()
```
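As a rough sanity check on the summary, we can trace the spatial dimensions by hand, assuming the 64 x 64 Olivetti input and the layers defined above (with Keras' default `'valid'` padding, a 2 x 2 convolution shrinks each side by 1, and 2 x 2 max pooling halves it, rounding down):

```python
size = 64          # input                   -> 64 x 64
size -= 1          # Conv2D(16, (2, 2))      -> 63 x 63
size -= 1          # Conv2D(16, (2, 2))      -> 62 x 62
size //= 2         # MaxPooling2D((2, 2))    -> 31 x 31
size -= 1          # Conv2D(32, (2, 2))      -> 30 x 30
size -= 1          # Conv2D(16, (2, 2))      -> 29 x 29
size //= 2         # MaxPooling2D((2, 2))    -> 14 x 14
flat = size * size * 16   # Flatten          -> 14 * 14 * 16 = 3136 values
print(size, flat)  # 14 3136
```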

Now that the model is ready, we need to work out how to train it. Unlike the previous machine learning notebook, we need to define every parameter and option when training deep learning models.

To train a model, we need an `optimizer` and a `cost`/`loss` function. The cost function measures how poorly the model performs; it gets smaller as the model makes better predictions. Our aim is to reduce the cost, and that's where the `optimizer` helps us.

There are different cost functions for different purposes. In this face recognition problem, we treat it as a classification problem with more than 2 classes. In such multi-class classification cases, we can use **categorical cross entropy** as our cost function.
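To make the cost function concrete, here is a minimal numpy sketch of categorical cross entropy (the function name and sample values are invented for illustration): for each sample it takes the negative log of the probability assigned to the true class, so confident correct predictions score low and wrong or unsure ones score high.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean categorical cross entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)                       # avoid log(0)
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))

# Two samples, three classes (one-hot targets).
y_true = np.array([[1, 0, 0], [0, 1, 0]])
y_good = np.array([[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]])     # confident and correct
y_bad  = np.array([[0.3, 0.4, 0.3],  [0.5, 0.3, 0.2]])      # unsure / wrong

# Better predictions give a lower loss.
print(categorical_cross_entropy(y_true, y_good) < categorical_cross_entropy(y_true, y_bad))  # True
```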

Like cost functions, there are many optimizers, too. We'll use the `Adam` optimizer to train our model. We also define `metrics` to track the performance of our model; to see how accurate our predictions are, we can pass `accuracy` as a metric.

We'll put all of these together in the `model.compile` function, where the cost function is passed as `loss` and the rest keep their names.

In [ ]:

```
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=['accuracy'])
```

Let's train the model and see how it performs. To do that, we call the `model.fit` function with the training data and labels, namely `X_train` and `y_train`, respectively. The training data will be iterated over again and again for `epochs` passes, where each pass is called an epoch.

The model will not process all of the training data at once; instead, the data is fed in smaller chunks whose size is defined by `batch_size`. To monitor how training is going, we can pass the validation data we prepared above via the `validation_data` parameter of `model.fit`. Calling this function will take some time depending on how powerful your machine is. If you're running this on Colab, you can enable hardware acceleration for faster training from the `Runtime` → `Change runtime type` menu: select a GPU- or TPU-enabled runtime and run all the cells again.

In [ ]:

```
model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_val, y_val));
```

Now we have a trained model, and we know how it behaves on the training and validation data. At this point, we have a ready-to-use model that we could deploy to a real-world application. Before that, though, we need to test how it performs on the test set. The following procedure predicts the classes (the person in the dataset) for the test set, compares the predictions with the ground-truth values, and then reports each metric we compiled the model with. (Note that loss comes as a built-in metric.)

To see the accuracy results, we can call the `model.evaluate` function and provide the test data and ground-truth labels, namely `X_test` and `y_test`. Then we can print the accuracy as a percentage by multiplying the `accuracy` value by 100.

In [12]:

```
loss, accuracy = model.evaluate(X_test, y_test, verbose=False)
print(f"Accuracy: {accuracy*100:.2f}%")
```
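Although `model.evaluate` computes accuracy for us, it can help to see how a softmax output turns into a predicted person. A minimal numpy sketch with hypothetical probabilities: the predicted class is simply the index of the largest value in each row, and accuracy is the fraction of rows where that index matches the true label.

```python
import numpy as np

# Hypothetical softmax outputs for 3 test images over 4 classes;
# each row sums to 1, and the predicted person is the highest-probability column.
probs = np.array([[0.10, 0.70, 0.10, 0.10],
                  [0.60, 0.20, 0.10, 0.10],
                  [0.05, 0.05, 0.10, 0.80]])
predicted = probs.argmax(axis=1)
print(predicted)   # [1 0 3]

# Accuracy against (hypothetical) ground-truth labels: 2 of 3 correct.
y_true = np.array([1, 0, 2])
print((predicted == y_true).mean())
```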