Frank Wilczek
Physicist, MIT; Recipient, 2004 Nobel Prize in Physics; Author, Fundamentals
Hidden Layers

When I first took up the piano, merely hitting each note required my concentrated attention. With practice, however, I began to work in phrases and chords. Eventually I was able to produce much better music with much less conscious effort.

Evidently, something powerful had happened in my brain.

That sort of experience is very common, of course. Something similar occurs whenever we learn a new language, master a new game, or get comfortable in a new environment. It seems very likely that a common mechanism is involved. I think it's possible to identify, in broad terms, what that mechanism is: We create hidden layers.

The scientific concept of a hidden layer arose from the study of neural networks. Here a little picture is worth a thousand words:

In this picture, the flow of information runs from top to bottom. Sensory neurons — the eyeballs at the top — take input from the external world and encode it into a convenient form (which is typically electrical pulse trains for biological neurons, and numerical data for the computer "neurons" of artificial neural networks). They distribute this encoded information to other neurons, in the next layer below. Effector neurons — the stars at the bottom — send their signals to output devices (which are typically muscles for biological neurons, and computer terminals for artificial neurons). In between are neurons that neither see nor act upon the outside world directly. These inter-neurons communicate only with other neurons. They are the hidden layers.

The earliest artificial neural networks lacked hidden layers. Their output was, therefore, a relatively simple function of their input. Those two-layer, input-output "perceptrons" had crippling limitations. For example, there is no way to design a perceptron that, faced with pictures of a few black circles on a white background, counts the number of circles. It took until the 1980s, decades after the pioneering work, for people to realize that including even one or two hidden layers could vastly enhance the capabilities of their neural networks. Nowadays such multilayer networks are used, for example, to distill patterns from the explosions of particles that emerge from high-energy collisions at the Large Hadron Collider. They do it much faster and more reliably than humans possibly could.
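A small sketch may make the limitation concrete. It does not use the circle-counting task mentioned above; it swaps in the exclusive-or (XOR) function, a standard textbook case that a single threshold unit cannot compute. The threshold units, the hand-picked weights, and the coarse weight grid are all illustrative assumptions, and the brute-force search only demonstrates the failure on that grid rather than proving it in general.

```python
from itertools import product

def step(x):
    """Threshold unit: outputs 1 when its weighted input is positive."""
    return 1 if x > 0 else 0

def perceptron(w1, w2, b, x1, x2):
    """Two-layer network: inputs feed a single output unit directly."""
    return step(w1 * x1 + w2 * x2 + b)

def with_hidden_layer(x1, x2):
    """Same inputs, but routed through two hidden units (hand-picked weights)."""
    h_or = step(x1 + x2 - 0.5)        # fires if at least one input is on
    h_and = step(x1 + x2 - 1.5)       # fires only if both inputs are on
    return step(h_or - h_and - 0.5)   # "or, but not and" is XOR

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
XOR = [0, 1, 1, 0]

# Try every weight combination on a coarse grid (an assumption made for
# illustration): no single-unit perceptron on this grid reproduces XOR.
GRID = [i / 2 for i in range(-4, 5)]  # -2.0 to 2.0 in steps of 0.5
perceptron_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(GRID, repeat=3)
    if [perceptron(w1, w2, b, x1, x2) for x1, x2 in INPUTS] == XOR
]

hidden_outputs = [with_hidden_layer(x1, x2) for x1, x2 in INPUTS]

print(len(perceptron_solutions))  # number of perceptrons that solve XOR
print(hidden_outputs)             # outputs of the network with a hidden layer
```

The two hidden units each detect a simpler pattern ("at least one" and "both"), and the output unit combines them. That intermediate vocabulary is what the two-layer perceptron lacks.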

David Hubel and Torsten Wiesel were awarded the 1981 Nobel Prize in Physiology or Medicine for figuring out what neurons in the visual cortex are doing. They showed that successive hidden layers first extract features of visual scenes that are likely to be meaningful (for example, sharp changes in brightness or color, indicating the boundaries of objects), and then assemble them into meaningful wholes (the underlying objects).

In every moment of our adult waking life, we translate raw patterns of photons impacting our retinas — photons arriving every which way from a jumble of unsorted sources, and projected onto a two-dimensional surface — into the orderly, three-dimensional visual world we experience. Because it involves no conscious effort, we tend to take that everyday miracle for granted. But when engineers tried to duplicate it, in robotic vision, they got a hard lesson in humility. Robotic vision remains today, by human standards, primitive. Hubel and Wiesel exhibited the architecture of Nature's solution. It is the architecture of hidden layers.

Hidden layers embody, in a concrete physical form, the fashionable but rather vague and abstract idea of emergence. Each hidden layer neuron has a template. It becomes activated, and sends signals of its own to the next layer, precisely when the pattern of information it's receiving from the preceding layer matches (within some tolerance) that template. But this is just to say, in precision-enabling jargon, that the neuron defines, and thus creates, a new emergent concept.
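The template idea can be sketched in a few lines. This is a toy, not a model of any real neuron: the function name `hidden_neuron`, the sum-of-absolute-differences mismatch measure, and the four-value "edge" template are all hypothetical choices made for illustration.

```python
def hidden_neuron(template, tolerance):
    """Build a unit that fires when its input matches `template` within `tolerance`."""
    def fire(signal):
        # Total mismatch between the incoming pattern and the stored template
        # (sum of absolute differences; an assumed, deliberately simple measure).
        mismatch = sum(abs(s - t) for s, t in zip(signal, template))
        return mismatch <= tolerance
    return fire

# A hypothetical "edge" concept: dark on the left, bright on the right.
edge = hidden_neuron(template=[0.0, 0.0, 1.0, 1.0], tolerance=0.5)

near_edge = edge([0.1, 0.0, 0.9, 1.0])     # close to the template
uniform_gray = edge([0.5, 0.5, 0.5, 0.5])  # no brightness change at all

print(near_edge, uniform_gray)
```

Once `edge` exists, later layers can treat "there is an edge here" as a single input, without re-examining the raw brightness values. In that limited sense the unit has defined a new concept.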

In thinking about hidden layers, it's important to distinguish between the routine efficiency and power of a good network, once that network has been set up, and the difficult issue of how to set it up in the first place. That difference is reflected in the difference between playing the piano, say, or riding a bicycle, or swimming, once you've learned (easy), and learning to do those things in the first place (hard). Understanding exactly how new hidden layers get laid down in neural circuitry is a great unsolved problem of science. I'm tempted to say it's the greatest.

Liberated from its origin in neural networks, the concept of hidden layers becomes a versatile metaphor, with genuine explanatory power. For example, in my own work in physics I've noticed many times the impact of inventing names for things. When Murray Gell-Mann invented "quarks", he was giving a name to a paradoxical pattern of facts. Once that pattern was recognized, physicists faced the challenge of refining it into something mathematically precise and consistent; but identifying the problem was the crucial step toward solving it! Similarly, when I invented "anyons" I knew I had put my finger on a coherent set of ideas, but I hardly anticipated how wonderfully those ideas would evolve and be embodied in reality. In cases like this, names create new nodes in hidden layers of thought.

I'm convinced that the general concept of hidden layers captures deep aspects of the way minds — whether human, animal, or alien; past, present, or future — do their work. Minds mobilize useful concepts by embodying them in a specific way, namely as features recognized by hidden layers. And isn't it pretty that "hidden layers" is itself a most useful concept, worthy to be included in hidden layers everywhere?