Boltzmann Machine

Published by SuperDataScience Team

December 19, 2018




We have already covered a number of topics, which you can look at as well.
Earlier we discussed artificial neural networks, which are used for regression and classification; convolutional neural networks, used for computer vision; recurrent neural networks, used for time series analysis; and self-organizing maps, which are used for feature detection.
Self-organizing maps were our first topic in unsupervised deep learning, and since we found them so interesting, today we will look at the Boltzmann machine, another unsupervised deep learning model.
We will dig deep into the functioning, structure, and design of the machine, trying to leave no stone unturned.
After all, that is what makes deep learning, deep learning. The diagram below gives an illustration of the topics we have already covered.
Starting from the far right, we have the artificial neural network, then the convolutional neural network, the recurrent neural network, and finally the self-organizing map.
The self-organizing map is an unsupervised deep learning model, but it still operates in a direction like the other three models, which is what makes the Boltzmann machine stand out.
The Boltzmann machine has connections too, but what makes it so interesting is that it does not operate in a direction: it is an undirected model.
As seen in the screenshot below, the connections of the Boltzmann machine have no arrows, so no connection points one way only; each one works in both directions.
What makes the Boltzmann Machine stand out?
The blue nodes are the visible nodes, while the red ones (or pink, forgive my color blindness) are the hidden nodes.
This neural network does not have an output layer, and everything is connected to everything: each node is linked to every other node, visible or hidden. As mentioned above, the machine has no direction.
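The structure described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the node counts, the random weights, and all variable names are hypothetical and do not come from the lecture. The point is that an undirected, fully connected network needs just one weight per pair of nodes, which makes the weight matrix symmetric.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_visible, n_hidden = 4, 3
n_units = n_visible + n_hidden

rng = np.random.default_rng(0)

# Undirected connections: one weight per pair of nodes, so W is symmetric.
# Random placeholder values stand in for weights that training would learn.
upper = np.triu(rng.normal(scale=0.1, size=(n_units, n_units)), k=1)
W = upper + upper.T  # W[i, j] == W[j, i]; the diagonal stays zero (no self-connections)

visible = np.arange(n_visible)          # indices of the visible nodes
hidden = np.arange(n_visible, n_units)  # indices of the hidden nodes

# Every node is connected to every other node, visible or hidden,
# and there is no separate output layer.
n_connections = n_units * (n_units - 1) // 2
print(n_connections)
```

With 4 visible and 3 hidden nodes there are 7 nodes in total, so the sketch has 7 × 6 / 2 = 21 connections.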
Many online tutorials jump at this stage to the Restricted Boltzmann Machine, perhaps because it is simpler and more practical.
But it is more interesting to understand the complex part first, so that everything else eventually falls into place.
How the Boltzmann machine functions
To make it easier: the visible nodes are all connected to each other as well as to the hidden nodes, and while the machine is being fed, the visible nodes are held fixed at the input data. Boltzmann machines also generate data; they do not rely only on input data.
The visible nodes receive the input data, and at the same time the hidden nodes generate values of their own, feed them into the system, and contribute to the results.
I guess I am sounding more confusing than convincing, so to get us all in the same boat, let me explain using an example that Geoffrey Hinton once used, the nuclear power plant shown below.
Boltzmann machine explained
This diagram, as simple as it looks, illustrates a number of activities and parts that work together to make the nuclear power plant function. The system is made of many components and structures, all of which it needs to function.
Let us focus on the containment building, the pump, the turbine, and the electricity output.
Why are we focusing on these? Because these are the areas that the operators of the power plant pay attention to, making sure they function within normal limits. For instance, the temperature in the containment building should be neither too high nor too low. The pressure at the pump should stay in its normal range. The rotation of the turbine should also be measured to ensure that it is running at normal speed.
Finally, the electricity output should match what is expected from the input. However, these are not all the components that affect the functioning of the nuclear power plant. There are more, and they are usually overlooked; or rather, no one really cares about them as long as everything mentioned above is working well.
To explain this in terms of the Boltzmann machine: the components that the power plant operators measure are like the data that the Boltzmann machine receives through its visible nodes (the blue nodes in our case).
But the Boltzmann machine does not stop at what it has been told to measure.
It uses its hidden nodes (the red ones) to account for other factors that the operators overlook. Let us take a look at the diagram again.
In our example, the overlooked components are the moisture of the soil (the green part of the diagram below) on which the control rods, uranium fuel, reactor vessel, pump, and condenser sit, and the speed of the wind (the blue section), though the latter can sometimes be measured.
Such factors are overlooked, yet they are important as well.
The Boltzmann machine is a representation of a system, and we may not input some of the values that are important in that system.
It looks at the overlooked states of the system and generates them. Therefore, it is not a deterministic deep learning model: the Boltzmann machine is a stochastic, generative deep learning model, because it has a way of generating the states of the system on its own. I hope we are all in the same boat now!
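For readers who want to see what "stochastic" means here, the standard textbook formulation of a Boltzmann machine with binary nodes can be written as follows. This is not taken from the lecture, and the symbols are assumptions for illustration: s_i in {0, 1} is the state of node i, w_ij = w_ji is the weight of the undirected connection between nodes i and j, b_i is the bias of node i, and the temperature is taken to be 1.

```latex
% Energy of a joint state s of all nodes (visible and hidden together)
E(s) = -\sum_{i<j} w_{ij}\, s_i s_j \;-\; \sum_i b_i\, s_i

% Probability of a joint state: low-energy states are more likely
P(s) = \frac{e^{-E(s)}}{Z}, \qquad Z = \sum_{s'} e^{-E(s')}

% Probability that node i switches on, given the states of all other nodes
P(s_i = 1 \mid s_{-i}) = \frac{1}{1 + \exp\!\left(-\left(\sum_{j \neq i} w_{ij}\, s_j + b_i\right)\right)}
```

The last line is where the randomness enters: a node is not set to a fixed output computed from its inputs; it is switched on with a probability that depends on its neighbors, so running the machine twice can produce different states.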
Intuitive deep learning of the Boltzmann Machine
We feed the data into the visible nodes so that the Boltzmann machine can learn to generate it.
The Boltzmann machine does not require labeled (supervised) data.
With most machines, we do not have the luxury of blowing them up just to collect data on failures, which is what a supervised model would need.
Therefore, we input examples of good behavior, the states in which the machine functions well; for instance, the correct temperature of the containment unit and the right pressure of the pump.
Using its hidden nodes, the Boltzmann machine will generate data that we have not fed in. That data helps us learn more about the machine at hand, in our case the nuclear power plant, so that we can prevent the conditions that would make it function abnormally.
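The idea of feeding in observed values and letting the hidden nodes fill in the rest can be sketched with Gibbs sampling, one common way to run a Boltzmann machine. Everything below is an assumption made for illustration: the weights are random placeholders rather than learned values, the "observed" readings are made up, and the function names are hypothetical. The sketch does not cover training.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sweep(state, W, b, free_units, rng):
    """Resample each node in free_units once, given all the other nodes."""
    for i in free_units:
        p_on = sigmoid(W[i] @ state + b[i])  # W[i, i] is 0, so a node does not feed itself
        state[i] = 1.0 if rng.random() < p_on else 0.0
    return state

rng = np.random.default_rng(42)
n_visible, n_hidden = 4, 3
n_units = n_visible + n_hidden

# Placeholder symmetric weights and zero biases; a trained machine would have learned these.
upper = np.triu(rng.normal(scale=0.5, size=(n_units, n_units)), k=1)
W = upper + upper.T
b = np.zeros(n_units)

hidden_units = range(n_visible, n_units)
all_units = range(n_units)

# Clamped phase: the visible nodes are held at made-up "good behavior" readings,
# and only the hidden nodes are resampled.
observed = np.array([1.0, 0.0, 1.0, 1.0])
state = np.concatenate([observed, rng.integers(0, 2, n_hidden).astype(float)])
hidden_counts = np.zeros(n_hidden)
n_sweeps = 200
for _ in range(n_sweeps):
    gibbs_sweep(state, W, b, hidden_units, rng)
    hidden_counts += state[n_visible:]
hidden_on_rate = hidden_counts / n_sweeps  # how often each hidden node was on

clamped_visible = state[:n_visible].copy()  # unchanged, because visible nodes were clamped

# Free-running phase: every node is resampled, so the machine generates visible values itself.
for _ in range(50):
    gibbs_sweep(state, W, b, all_units, rng)
generated_visible = state[:n_visible].copy()

print(hidden_on_rate, generated_visible)
```

Note that `gibbs_sweep` applies the same update rule to any node it is given; the only difference between the two phases is which nodes are allowed to change.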
The nodes of the Boltzmann machine are all connected, so it treats data from the hidden and the visible nodes equally. It gives priority neither to the data it has been fed nor to the data that the hidden nodes have generated.
The Boltzmann machine gives all the data the same priority and produces its results from all of it.
With that explained, there is more we need to look at: contrastive divergence and deep belief networks. So stick around for more information and, since it is interesting, more fun!
