in Beginner

What are Dense Layers?

Dense layers are the most basic and important layers in neural networks. But dense layers are also versatile, since they can be used as hidden layers, as input and output layers.

In some problems, like the Fashion MNIST clothes recognition database, dense layers are used as hidden and output layers. What changes for them is only what we call the activation function.

In fact, deep learning is only deep because of dense layers. The notion of a deep network comes from the fact that dense layers pile up to receive input and return outputs until we find the answer for a problem.

So, the more layers we have, the deeper our network is.

What’s an activation function?

When going through some tutorials about Deep Learning, you can come across the activation parameter when you use a Dense layer.

The activation parameter receives the name of the activation function that you want to use for this layer.

These functions receive the activation name just because, such as in a brain neuron, the activation and deactivation of these neurons are responsible for the way as we deal with information in real world.

The activation function will work on a neural network by copying this natural phenomena: it will do small changes to weights and biases until we have satisfactory outputs.

Hence, these functions will check the value produced by a neuron and decide whether outside connections should use this neuron output or not.

The choice of an activation function will depend on the kind of output you are trying to get. Softmax, for instance, will return a number between 0 and 1, which is good when you need to calculate probabilities.

The Hyperbolic Tangent Function (tanh) outputs a result comprised between -1 and 1.

The image below, from Sebastian Raschka, shows some of the most common activation functions.

What are Dense Layers?
Credit: Sebastian Raschka


Dense layers are great for a lot of simple problems, but they are not good when we are trying to detect patterns and repetitions.

Since they produce the same output vector for a given input vector, they are not suitable to be used in problems where we need different answers on the same input.

Instead, they work great when you need to improve a model and add more layers to it.

Do you want to connect? It will be a pleasure to discuss Machine Learning with you. Drop me a message on LinkedIn.

Leave a Reply