Boltzmann Machines
A Restricted Boltzmann Machine (RBM) is a two-layer energy-based model: visible units that hold the data, hidden units that learn features, and no connections within a layer. Below, the network trains live on tiny digit patterns via contrastive divergence; you can then paint corruption onto a digit and watch Gibbs sampling clean it up.
Train an RBM with contrastive divergence
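The model's joint energy is $E(v, h) = -a^\top v - b^\top h - v^\top W h$, with conditionals $p(h_j = 1 \mid v) = \sigma\big(b_j + \sum_i v_i W_{ij}\big)$ and $p(v_i = 1 \mid h) = \sigma\big(a_i + \sum_j W_{ij} h_j\big)$, where $W$ is the weight matrix, $a$ and $b$ are the visible and hidden biases, and $\sigma$ is the logistic sigmoid. Training pushes weights so the data is low-energy and "fantasies" sampled from the model are high-energy: $\Delta W \propto \langle v h^\top \rangle_{\text{data}} - \langle v h^\top \rangle_{\text{model}}$. Below, 32 hidden units learn 64-pixel features (8×8 digits). Each square in the filter grid is one hidden unit's weight pattern; green is excitatory, red is inhibitory. Watch random noise resolve into stroke detectors over a few hundred CD-1 steps.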
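The CD-1 update described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code behind the demo: the function names (`cd1_step`, `train_toy_rbm`), the learning rate, the weight initialisation, and the two bar patterns standing in for digits are all assumptions, as is the choice to use reconstruction probabilities rather than binary samples in the negative phase.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, rng, lr=0.1):
    """One CD-1 update on a batch v0 of shape (batch, n_visible).

    W has shape (n_visible, n_hidden); a and b are the visible and hidden
    biases. Parameters are updated in place. Returns the mean squared
    reconstruction error, a rough progress signal (not the likelihood).
    """
    # Positive phase: hidden probabilities and a binary sample given the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step gives the "fantasy" reconstruction.
    pv1 = sigmoid(h0 @ W.T + a)   # assumption: keep probabilities, do not sample
    ph1 = sigmoid(pv1 @ W + b)
    # Data statistics minus model statistics, averaged over the batch.
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    a += lr * (v0 - pv1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return float(np.mean((v0 - pv1) ** 2))

def train_toy_rbm(n_steps=300, n_hidden=32, seed=0):
    """Train on two hypothetical 8x8 patterns (a vertical and a horizontal bar)."""
    rng = np.random.default_rng(seed)
    bar_v = np.zeros((8, 8))
    bar_v[:, 3] = 1.0
    bar_h = np.zeros((8, 8))
    bar_h[4, :] = 1.0
    data = np.stack([bar_v.ravel(), bar_h.ravel()])   # shape (2, 64)
    W = rng.normal(0.0, 0.01, size=(64, n_hidden))    # small random weights
    a = np.zeros(64)
    b = np.zeros(n_hidden)
    errors = [cd1_step(data, W, a, b, rng) for _ in range(n_steps)]
    return W, a, b, errors
```

Calling `W, a, b, errors = train_toy_rbm()` returns the trained parameters and the per-step reconstruction error. Reshaping a column with `W[:, j].reshape(8, 8)` gives the kind of per-unit weight pattern shown in the filter grid.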
Corrupt a digit, then Gibbs-sample it back
Pick a target, paint over pixels to corrupt it, then iterate $h \leftarrow \sigma(W^\top v + b)$ and $v \leftarrow \sigma(W h + a)$, with $b$ and $a$ the hidden and visible biases. The default is anchored inference: pixels not flipped by the slider are observed and clamped to their true value at each step, so the chain only fills in the unknown holes. This survives very high corruption because real evidence keeps it on track. Free mean-field drops the clamps and lets the whole image relax, which collapses toward the prior at high corruption. Stochastic Gibbs draws binary samples instead of propagating probabilities, which is useful for generation but not for denoising.
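The three inference modes above differ only in whether known pixels are re-clamped and whether units are sampled. The sketch below makes that explicit. It is an illustration under stated assumptions rather than the demo's implementation: the function name `infer`, its `clamp` and `stochastic` flags, the step count, and the choice to start the unknown pixels from their corrupted values are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def infer(v_start, known, W, a, b, n_steps=20,
          clamp=True, stochastic=False, rng=None):
    """Alternate hidden and visible updates starting from a corrupted image.

    v_start : (n_visible,) corrupted image with values in [0, 1]
    known   : (n_visible,) boolean mask, True where the pixel is observed
    clamp=True,  stochastic=False -> anchored inference
    clamp=False, stochastic=False -> free mean-field
    stochastic=True               -> stochastic Gibbs sampling
    """
    if rng is None:
        rng = np.random.default_rng()
    v = np.asarray(v_start, dtype=float).copy()
    for _ in range(n_steps):
        h = sigmoid(v @ W + b)
        if stochastic:
            h = (rng.random(h.shape) < h).astype(float)
        v_new = sigmoid(h @ W.T + a)
        if stochastic:
            v_new = (rng.random(v_new.shape) < v_new).astype(float)
        # Anchoring: restore the observed pixels after every update.
        v = np.where(known, v_start, v_new) if clamp else v_new
    return v
```

With `clamp=True` the observed pixels are restored after every update, so only the masked-out pixels ever change; with `clamp=False` the mask is ignored and the whole image is free to drift.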
- Hinton, G., Sejnowski, T. Learning and relearning in Boltzmann machines, in Parallel Distributed Processing, MIT Press, 1986.
- Smolensky, P. Information processing in dynamical systems: Foundations of harmony theory, 1986. (The Restricted Boltzmann Machine, originally "harmonium".)
- Hinton, G. Training products of experts by minimizing contrastive divergence, Neural Computation 14, 2002.
- Hinton, G., Salakhutdinov, R. Reducing the dimensionality of data with neural networks, Science 313, 2006.