Building a Deep Learning Model: Recognition of Handwritten Digits

PRIYADHARSHINI K
8 min read · Jun 6, 2021

Deep learning is an increasingly popular subset of machine learning. Deep learning models are built using neural networks.

Keras is a user-friendly neural network library written in Python. In this tutorial, we are going to build a project that recognizes handwritten digits.

Most machine learning programs follow the same four basic steps:

1) Import necessary libraries

2) Load the dataset

3) Train our algorithm

4) Test our algorithm

We will follow these steps in our project.

First of all, what is handwritten digit recognition?

Handwritten digit recognition is the task of predicting which digit a handwritten image shows; in this project we do it with a deep learning network trained on MNIST.

The MNIST dataset

MNIST stands for ‘Modified National Institute of Standards and Technology’. The dataset consists of handwritten digits from 0 to 9 and provides a standard benchmark for testing image processing systems. It is often considered the ‘hello world’ of deep learning. MNIST contains 70,000 images of handwritten digits: 60,000 for training and 10,000 for testing. The images are grayscale, 28x28 pixels, and centered to reduce preprocessing and get started quicker. Keras is a high-level neural network API focused on user friendliness, fast prototyping, modularity and extensibility.

OK, let’s start our program…

STEP 1: Import necessary libraries

First, we need to import from the __future__ Python module.

These statements tell the interpreter to compile certain constructs with the semantics that will be available in a future Python version. In other words, Python uses from __future__ import feature to backport features from newer Python versions to the current interpreter.

If we had not used the __future__ module, both print statements would have printed 1 under Python 2.

In Python 3, print() is already a function, so we do not need the __future__ module; even without it, the code gives the desired output.

Including the __future__ module lets you gradually get used to incompatible changes, including ones that introduce new keywords and operators. Python does not allow new operators or keywords to be introduced except through the __future__ module.
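Since the original code listing is not reproduced here, this is a minimal sketch of the kind of __future__ import the article describes; the exact features imported in the original (print_function, division, and so on) may differ:

```python
# Backport Python 3 behaviour to a Python 2 interpreter.
# Under Python 3 these lines are harmless no-ops.
from __future__ import print_function, division

print(3 / 2)  # 1.5 with the division feature; plain Python 2 would print 1
```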

Then, we need to set a seed with numpy.random.seed().

Using numpy.random.seed(number) is a best practice for creating reproducible work with NumPy. Setting the random seed means your results can be reproduced by anyone who runs your code.
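A short sketch of the seeding step; the seed value 42 below is an arbitrary choice, not necessarily the one used in the original code:

```python
import numpy as np

# Fix NumPy's random number generator so repeated runs produce the same results.
np.random.seed(42)
```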

Then, we import all the modules we need for training our model. The Keras library already ships with some datasets, and MNIST is one of them, so we can easily import the dataset and start working with it.
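A sketch of the imports the rest of the tutorial relies on, assuming the tensorflow.keras namespace (with older standalone Keras the same names are imported from keras.* instead):

```python
# Dataset, model container, layer type, optimizer and one-hot encoding helper.
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
```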

Why did we use Sequential here?

There are two ways to build Keras models: sequential and functional. The sequential API allows us to create models layer-by-layer, which is enough for most problems. It is very straightforward (a simple list of layers), but is limited to single-input, single-output stacks of layers. It does not allow us to create models that share layers or have multiple inputs or outputs.

What is a Dense layer?

A Dense layer is the regular densely connected neural network layer; it is the most common and frequently used layer. A Dense layer applies a simple operation to its input and returns the output.
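Concretely, a Dense layer computes output = activation(dot(input, kernel) + bias), where kernel is the layer’s weight matrix and bias its bias vector. A tiny NumPy illustration (not Keras code):

```python
import numpy as np

# Illustration only: what a single Dense layer computes on a forward pass.
# 'kernel' is the layer's weight matrix, 'bias' its bias vector.
def dense_forward(inputs, kernel, bias, activation):
    return activation(np.dot(inputs, kernel) + bias)
```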

What are optimizers?

Optimizers are algorithms or methods used to change the attributes of the neural network, such as the weights and the learning rate, in order to reduce the loss. They solve the optimization problem by minimizing the loss function. Here, we use SGD (Stochastic Gradient Descent).
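In code this is just a matter of constructing the optimizer; the learning rate of 0.01 below is an illustrative choice, not necessarily what the original article used:

```python
from tensorflow.keras.optimizers import SGD

# Stochastic Gradient Descent: updates the weights by stepping against the
# gradient of the loss, scaled by the learning rate.
optimizer = SGD(learning_rate=0.01)
```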

We have already discussed MNIST above: it contains 70,000 images of handwritten digits. Of these, we use 60,000 images as the training set and 10,000 as the test set.

Next, let’s declare num_classes, batch_size and epochs.

num_classes

num_classes is the total number of classes in the dataset. Here the results can be the digits 0 to 9, so num_classes = 10; we cannot change this for MNIST. (If it were left as None, it would be inferred as the largest number in y plus 1.)

batch_size

It is defined as the number of training examples used in one iteration. In this case, we train on the first 128 examples in the first iteration, take the next 128 in the next iteration, and so on until the whole dataset has been seen.

epochs

An epoch is one full pass over the training samples. The number of epochs is how many times the algorithm runs over the whole dataset; we pass the same dataset through the network multiple times so it can converge to good weights.
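Putting the three values together (128 and 20 are the values this article trains with; 10 is fixed by the digits 0 to 9):

```python
num_classes = 10   # digits 0-9
batch_size = 128   # training examples per gradient update
epochs = 20        # full passes over the training set
```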

STEP 2: Load dataset

We load the data from MNIST into our variables, splitting the dataset into a training set and a test set. x_train and y_train are the variables used for training; x_test and y_test are not used during the training phase and are reserved for testing the algorithm.
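A sketch of the loading step, using the mnist module imported earlier:

```python
# Download (on first use) and load MNIST, already split into train and test sets.
(x_train, y_train), (x_test, y_test) = mnist.load_data()

print(x_train.shape)  # (60000, 28, 28)
print(x_test.shape)   # (10000, 28, 28)
```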

The image data cannot be fed directly into the model so we need to perform some operations and process the data to make it ready for our neural network.

We need to reshape the datasets.

You may ask why. We have to reshape the data so that the network can access every pixel of each image; only then can it work with each pixel’s intensity value. We then store the reshaped arrays back in x_train and x_test respectively.
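A sketch of the reshape: each 28x28 image is flattened into a vector of 784 pixel values so it can be fed to a Dense layer.

```python
# Flatten every image from a 28x28 grid to a 784-element vector.
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
```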

Why are we changing the datatype of the dataset to float32?

It is most common to use 32-bit precision when training a neural network, so at one point the training data will have to be converted to 32 bit floats. Since the dataset fits easily in RAM, we might as well convert to float immediately.

“x_train /= 255” — why are we including this line?

255 is the maximum value of a byte (the input feature’s type before the conversion to float32), so dividing by it ensures that the input features are scaled between 0.0 and 1.0. Scaling matters: for example, if the inputs were 100 times larger than you are used to, the learning rate would need to be about 100 times smaller, and the loss would be larger than usual (with mean squared error, roughly 100² = 10,000 times larger).
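The conversion and scaling steps look like this:

```python
# Cast the pixel values to 32-bit floats, then scale them into [0.0, 1.0].
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
```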

Is that all? Can we start training our algorithm? Not yet! There is one more step to complete.

We need to convert the class vectors to binary class matrices.

Using the method to_categorical(), a NumPy array (or vector) of integers representing different categories can be converted into a NumPy array (or matrix) of binary values, with as many columns as there are categories in the data.
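A sketch of the one-hot encoding step, using the num_classes declared earlier:

```python
# Turn integer labels (e.g. 7) into one-hot vectors
# (e.g. [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]).
y_train = to_categorical(y_train, num_classes)
y_test = to_categorical(y_test, num_classes)
```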

Next, we need to declare the model type.

What does this line mean?

As we already discussed, it is a sequential model, so we declare it with Sequential().

We have two hidden layers (the first two layers) and an output layer (the last layer), so the model has three layers in total.

Here 512 is the number of nodes in each hidden Dense layer. Increasing the number of nodes in each layer increases model capacity.

What is meant by an activation function?

An activation function in a neural network defines how the weighted sum of the input is transformed into an output from a node or nodes in a layer of the network.

Why are we using sigmoid here?

The main reason we use the sigmoid function is that its output lies between 0 and 1. It is therefore especially useful for models where we have to predict a probability as an output: since a probability exists only in the range 0 to 1, sigmoid is a natural choice. The function is also differentiable.
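A sketch of the model the article describes: two sigmoid hidden layers of 512 nodes each and a 10-way output layer. The softmax on the output layer is an assumption on my part; it is the usual pairing with the categorical cross-entropy loss used in the next step.

```python
model = Sequential()

# Two hidden layers of 512 nodes each, with sigmoid activations.
model.add(Dense(512, activation='sigmoid', input_shape=(784,)))
model.add(Dense(512, activation='sigmoid'))

# Output layer: one node per digit class (softmax assumed, see note above).
model.add(Dense(num_classes, activation='softmax'))

model.summary()
```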

STEP 3: Training our algorithm

OK, after all this preparation, we can move straight on to the core of the code.

What are we going to do next?

Training our algorithm is the next step…

We need to compile the model before we can train it. While compiling, we declare the loss function and the optimizer (SGD).
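A sketch of the compile call, assuming categorical cross-entropy as the loss (the standard choice once the labels are one-hot encoded) and accuracy as the reported metric:

```python
# Categorical cross-entropy pairs with one-hot encoded labels;
# SGD() here uses Keras's default learning rate (swap in your own if desired).
model.compile(loss='categorical_crossentropy',
              optimizer=SGD(),
              metrics=['accuracy'])
```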

Next, we need to fit the training set to the model…

Keras’s model.fit() function starts the training of the model. It takes the training data (the training set), the validation data (here the test set), the number of epochs, and the batch size.
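A sketch of the training call, reusing the batch_size and epochs declared earlier and passing the test set as validation data, as the article describes:

```python
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,                         # print progress per epoch
                    validation_data=(x_test, y_test))  # monitor on the test set
```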

It takes some time to train the model, so have some coffee while it completes.

Oh yeah! It’s over. Our model has completed training. We trained it on 60,000 images, passing over them 20 times because epochs = 20.

STEP 4: Test our algorithm

OK, our algorithm has completed its training, but how do we know how well it was trained?

For that, we need to test our algorithm, right?

We are going to test our algorithm with our testing set

For that, we use the evaluate() function and feed it the test set of 10,000 images.

After running the test, we need to report the score, right?
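A sketch of the evaluation step: evaluate() returns the loss and the accuracy on the 10,000 test images.

```python
# Evaluate on the held-out test set and print the resulting loss and accuracy.
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
```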

Wow! Our model is quite capable now. It can recognize handwritten digits and can be used wherever needed.

Thank you, everyone! Hope you found this tutorial useful.
