The difference between Machine Learning and traditional software development – Machine Learning crash course

Observe the sets of inputs and outputs below:

Inputs: [0,1,2,3,4,5]
Outputs: [3,5,7,9,11,13]

Now, can you figure out what the output will be if the input is the number 7? If you answered 17, then you found the right answer.

In this case, all you have to do is apply the equation:

y = (x * 2) + 3
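Just to make it concrete, here is the same check in Python (a quick sanity check of the equation with the example input, nothing more):

# apply the hidden equation to the example input
x = 7
y = (x * 2) + 3
print(y)  # prints 17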

But how do you discover this equation? By the way, I forgot to mention that I made it up…

This is where machine learning and traditional software development differ.

In traditional development, we create an algorithm that transforms inputs into outputs. So we know, in advance, our inputs and the equation that will produce the desired outputs.

In machine learning, all we have are the inputs and the outputs. Our job, as devs, is to discover the hidden algorithm connecting these two elements.

In the end, when you are developing traditional software, you are actually creating algorithms that take some input data and produce outputs.

In machine learning, we are trying to discover the actual algorithm in order to predict the outputs for new inputs.

We do this by approximation. Given the infinite number of variables present in some real-life problems, it’s not always possible to create an algorithm that produces 100% accurate answers.

In machine learning, some degree of error is acceptable and even desirable.

How is this done?

The only way to discover the equation hidden between two sets of inputs and outputs is by observation.

In machine learning, we call this training.

Training is, in very simplified terms, observing past relationships between inputs and outputs and trying to predict future outcomes.

When you discover these relationships, you get a model.

A model is the equation you are looking for. Some models work very well, while others perform very badly. A bad model is one that was not able to capture the hidden relationship between inputs and outputs. In this case, we say that the model doesn’t generalize well.

Bad generalization can be the result of different problems, but usually, it’s caused by:

  1. Bad data
  2. Missing variables
  3. Missing examples

Variables are the aspects of a problem that are important for figuring out the outcome.

Let’s say that every time it rains, umbrella sales rise. So, raining or not raining is a variable that we have to account for if we want to discover what makes umbrella sales rise.

If you imagine data as a big Excel file, the variables would be the columns, and the rows would be the examples. We use examples to train our algorithm. More examples usually mean more chances to observe the outcome of a given input.
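To make this concrete, here is a toy sketch of that “Excel file” in plain Python, using the umbrella example (the column names and numbers are made up for illustration):

# a toy dataset: columns are the variables, rows are the examples
columns = ['raining', 'umbrella_sales']  # the variables
examples = [                             # the examples, one per row
    [True,  120],
    [False,  40],
    [True,  135],
]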

Hands-on exercise: building your first TensorFlow model

To illustrate what we just talked about, let’s create a model in TensorFlow that will try to figure our equation out.

If you want, you can download the complete Python notebook here: https://github.com/vallantin/Vallant.in—Blog-Posts/blob/master/001_hidden_equation_first_tf_model.ipynb.

The first thing to do is to import the necessary Python libraries.

# import libraries
import tensorflow as tf
import numpy as np 
import matplotlib.pyplot as plt

Then, we will create our examples. The examples are composed of two NumPy arrays: one with the inputs and one with the outputs.

Since this is only a practice exercise, we already know the equation we are looking for. So let’s use it to generate the output list.

# Create the sets of input and output data
input_data = np.array(range(200), dtype=float)
output_data = [(i * 2) + 3 for i in input_data]
output_data = np.array(output_data, dtype=float)

Now, we can create the model. In TensorFlow, models are built from what are called layers. Our model will be composed of one single layer with a single neuron, creating a Dense network.

Don’t worry about these names now. You will understand what they are when you start to dive into TensorFlow.

The “input_shape=[1]” specifies that the input to this layer is a single value. The “units” parameter tells TensorFlow how many neurons we are going to use.

The number of neurons should be defined according to how many input variables your layer has and what format the output should take. Since we only have 1 layer, 1 input variable and 1 output, we will use 1 neuron.

# create the model
layer_0 = tf.keras.layers.Dense(units=1, input_shape=[1])
# Assemble the layers and get the model
model = tf.keras.Sequential([layer_0])

It’s time to compile the model. Compiling is where we define the loss function and the optimizer.

During the training phase, our model will try to guess the output for a given input. Then, it will compare its prediction with the real output. The difference between the actual value and the prediction is the loss, which tells the model how far off it is.

The optimizer function will then be used to reduce the loss.
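If you are curious about what the mean squared error loss we are about to use actually computes, here is a hand-rolled sketch (Keras computes this for us when we pass loss='mean_squared_error'; the numbers below are made up):

# mean squared error by hand: the average of the squared differences
# between the real outputs and the predictions
actual    = np.array([3.0, 5.0, 7.0])
predicted = np.array([2.5, 5.5, 8.0])
mse = np.mean((actual - predicted) ** 2)
print(mse)  # 0.5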

# Compile model
model.compile(loss='mean_squared_error',
              optimizer=tf.keras.optimizers.Adam(0.01))

Now, we can start to train the model:

# train model
history = model.fit(input_data, output_data, epochs=150, verbose=False)
print("Finished training the model")

When training is finished, we can check the model’s training statistics. The fit method produces a history object that we can use to plot how the model’s loss goes down after each training epoch.

One epoch is a full iteration over the set of examples we are providing. So, since we provide 200 examples and train for 150 epochs, the model will see a total of 30,000 examples during training (200 * 150).

# show statistics
plt.xlabel('Epoch')
plt.ylabel("Loss")
plt.plot(history.history['loss'])
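You can also read the loss after the last epoch straight from the history object (a quick sanity check; the exact number will vary from run to run):

# print the loss value after the final training epoch
final_loss = history.history['loss'][-1]
print('Loss after {} epochs: {:.4f}'.format(len(history.history['loss']), final_loss))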

Finally, we can use this model to do a prediction. Let’s see if the model is able to identify what should be the output, given an input of 300.

# predict what's the output for the input 300
output_300 = model.predict([300])[0][0]
print('When the input is 300, the output is {:.2f}.'.format(output_300))

Here, you will notice something strange. For me, the model returned an output of 605.09, but you may see a different value. Depending on the inputs, the configuration of your network, and so on, the results may vary.

The important thing is that your result is very close to mine.
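As a side note, if you want more reproducible runs, you can seed NumPy’s and TensorFlow’s random number generators before building the model. This is not in the original notebook, just an option (tf.random.set_seed is available in TensorFlow 2):

# optional: seed the random number generators before creating the model
np.random.seed(42)
tf.random.set_seed(42)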

That being said, it’s time to check what the real value should be. Since we know the equation we used to produce the output values, we can use it to check the output for 300 too.

# let's confirm that we are getting a precise output
x = 300
y = (x * 2) + 3
print('Let\'s confirm that, when the input is 300, the output is {:.2f}.'.format(y))

As you can see, the real value is 603 and our model did very well! The error was only 2.09.

Is there a way to check the layer variables of this model? Yes! You can use the get_weights method on the layer.

# check the weights for this model
layer_0.get_weights()

The result is:

[array([[2.0123928]], dtype=float32), array([1.3683615], dtype=float32)]
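These two arrays map directly onto our hidden equation y = (x * 2) + 3: the first is the learned slope and the second is the learned intercept. A small sketch to make that explicit, using the layer_0 defined above:

# reconstruct the learned equation from the weights:
# for a Dense layer, get_weights() returns [kernel, bias]
weights, bias = layer_0.get_weights()
slope = weights[0][0]  # close to the true slope of 2
intercept = bias[0]    # farther from the true intercept of 3
print('Learned equation: y = ({:.4f} * x) + {:.4f}'.format(slope, intercept))

Notice that the slope is almost exactly 2, while the intercept is still some distance from 3; that gap is where most of the 2.09 prediction error comes from.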

Try retraining your model and see which predictions it makes. Play with the epoch values and check whether the results get closer to the real output. Don’t forget to post your results in the comments area below.


Do you want to connect? It will be a pleasure to discuss Machine Learning with you. Drop me a message on LinkedIn.
