WHAT SHAPE ARE A GERMAN SHEPHERD'S EARS?
[STEPHEN KOSSLYN:] For the last 30 years I’ve been obsessed with a question: What shape are a German Shepherd’s ears? Of course, I'm not literally interested in that question, since if I were I could just go out and look at dogs; I’m really interested in how people answer the question from memory. Most people report that they visualize the dog’s head and mentally "look at" its ears. But what does it mean to visualize something? What does it mean to "look at it in your mind"? It's a bit absurd, because there can't be a little man in there that is actually looking at a picture. If there were, there would have to be a little man inside that man's head, and so forth, and it doesn't make any sense.
For many years we tried to collect objective evidence to show that when you have the experience of visualizing, there’s actually something pictorial in your head. It turned out that the best way to approach this was by turning to the brain. There are parts of the brain that are physically organized such that when you look at something, a corresponding pattern is physically laid out on the cortex. Even the first visual area in the processing stream is often activated during visual imagery — even if your eyes are closed when you visualize. Moreover, the way it's activated depends on what you’re visualizing. If you visualize something that’s vertical, you find activation along the so-called vertical meridian; if it’s horizontal, the activation flips over on its side. It’s absolutely amazing. Similarly, visualizing objects at different sizes changes the pattern of activation in ways very much like what occurs if you are actually seeing objects at the corresponding sizes.
But I’ve been working on this for over 30 years now and I want to move on. Instead of trying just to establish that there actually are mental images and that these images are bona fide representations that have a functional role in processing systems, I want to ask: So what? Who cares? Why should my mother be interested in this kind of thing?
Lately I’ve been working on something that I'm tentatively calling the "Reality Simulation Principle." It is built on my lab's findings that about two-thirds of the same brain areas are involved in visual mental imagery and visual perception. This finding occurs even when the tasks seem very different on the surface (for example, visualizing an upper case letter in a grid and deciding whether an X mark would fall on the letter if it were actually in the grid versus deciding whether a spoken name is appropriate for a picture). This is a huge amount of overlap, which leads us to suspect that an object seen in a mental image can have the same impact on the mind and body that the actual object would have. My notion is that once the brain systems are engaged, they don't know where the impetus came from. This means that they can produce the same effects whether you activated them endogenously (from information in memory) or exogenously (from looking at something).
The "Reality Simulation Principle" describes how to use mental images as stand-ins for actual objects—to manipulate yourself, basically. It is useful to understand it in conjunction with what I call the GITI cycle, which stands for Generate, Inspect, Transform, Inspect. If mental images can simulate or stand in for actual objects and scenes, you can generate the image, inspect what you’ve got, transform it, and inspect the result. This can be done iteratively, meaning that you can use imagery to take advantage of the "Reality Simulation Principle" to do all sorts of good things for yourself.
What kinds of good things am I talking about? Memory is one obvious example. From the work of Alan Paivio and countless others, we know that you’re able to remember objects better than pictures of objects, and pictures of objects better than words. It also turns out that if you visualize the objects named by words you do better than you would otherwise. Consequently, we're interested now in things like hypnosis. We can hypnotize you, have you visualize an object, and imagine that it’s actually a three-dimensional object, appearing in glorious vivid detail. In this case, your memory would be boosted even further.
Mental practice is another candidate. Neuroscientists such as Marc Jeannerod and Jean Decety have shown that imagining doing something recruits most of the brain mechanisms that would guide the corresponding actual movements. And people in sports psychology have shown that by imagining that you’re engaging in some activity you’ll actually get better at doing it. This process involves generating an image, inspecting the image, transforming it by imagining your movements, seeing what the result would be, and then cycling through again. The next time through you can change the image as a function of the result you saw. If you imagine you’re playing golf, for example, and your ball doesn’t get in the hole, you can imagine what would happen if you whacked it a little more softly. Mental practice clearly works. By understanding how the mechanisms of imagery work we can actually optimize this mental practice.
The "Reality Simulation Principle" can also be used to acquire self-knowledge. Try this one out. Imagine it’s dusk, you’re walking alone, and you’re late. You start to walk faster and then notice a short-cut through an alley. It’s getting a little dark, but you really don’t want to be late, so you start to go towards it, and you notice that there are three guys lingering near the mouth of the alley. Now think about a first scenario: The three guys look like they’re 20 years old, are wearing long droopy shorts, dirty t-shirts, baseball caps that are on backwards, and are smoking cigarettes. As you get close, they stop talking, and all three heads swivel and fixate on you and start tracking you. How do you feel?
Now try the same thing, except instead of those three guys, make them three balding middle-age, overweight accountants wearing suits. They’re standing there smoking cigarettes, and their heads swivel as they track you. How do you feel now?
You can start simulating the effects of different attributes. For example, what if the guys are black or Latino teenagers? How do you feel? If you can start actually sorting out your own emotional landscape by running these kinds of mental simulations, you may, in fact, discover certain things about yourself that may be surprising.
Some people who confront their own beliefs may think they’ve got some racial issues, and it may turn out they’re actually class issues. Make those middle-aged accountants black and see how you feel. Those kinds of simulations can help produce self-knowledge, and can help a person to improve his emotional intelligence.
You can also manipulate your body. It’s obvious that if you have a sexual fantasy you manipulate your body by imagery. Also, if you imagine something scary—an anticipated encounter with an authority figure or a walk along a narrow path in the mountains that is starting to crumble—your palms will sweat and your heart beat will change. It’s clear that mental imagery can affect the body, but it turns out that it may be more interesting than that. For example, one of the things we’re studying now is how to change your hormonal landscape by manipulating your images.
There's something called the victory effect, where if you're a male and you win some sort of contest your testosterone goes up afterwards. If you lose, it goes down. This is not a surprise. It also turns out that if you watch your favorite team win your testosterone will go up. If your team loses, it’ll go down. This even works if you’re watching chess, so it’s not about being aroused. In fact, it works for the chess players and for the people who are watching chess games.
Why is this interesting? With men it turns out that spatial abilities vary as a function of testosterone levels. In the fall, males’ testosterone levels are relatively high. They go down thereafter, and then they pick up again. Much research suggests that the relation between testosterone levels and spatial abilities is a U-shaped function; your spatial abilities are not as good if you have too much testosterone or too little testosterone. As you get older, both testosterone levels and spatial abilities drop. There is a lot of evidence that there is a connection between the two. The question is, can we manipulate one’s spatial abilities by having you run simulations of watching yourself win or lose? If the "Reality Simulation Principle" is correct, manipulating your own testosterone levels would in turn affect your spatial abilities. This is work in progress in my lab, in collaboration with Peter Ellison and Carole Hooven; stay tuned.
My point is that you can use the "Reality Simulation Principle" in a lot of different ways, including some ways that are not intuitively obvious, such as manipulating your hormonal landscape. Mental imagery is also important in creativity and problem solving. Einstein reported that most of his thinking was done with images prior to any kind of verbal or mathematical statement. We know quite a bit now about how to use images in the service of solving problems and being creative. In fact, Ron Finke has written a couple of unusually creative books on this topic.
Other people have also claimed that you can even manipulate your health by using what I'm calling the "Reality Simulation Principle". I’m a little skeptical about this. We may look at that eventually, but not right now. It's certainly the case that you can manipulate the placebo effect to some extent, but the medical effects of the "Reality Simulation Principle" are probably not huge. It's not going to cure cancer.
My premise has always been that the mind is what the brain does. Of course, that’s a little too glib; really the mind is what the cortex does, since the brain does things like respiration that are not mental. If this is the case, then the question becomes, how do we understand information processing in the brain?
Switching topics, let's return to the role of computation in all of this. The computer is convenient because it allows us to think about how events at different levels of analysis can interact. This is one of the deepest questions in psychology, and probably science in general. It's really a mystery. How is it that semantics and the meaning of things dictate a sequence of events in this wet machine? The wet machine itself has neurons, each of which has an average of ten thousand connections. Sure it's complicated, but ultimately you can understand the whole thing in terms of chemistry and physics.
But how does this machine produce semantically interpretable, coherent sequences of activity, and allow these activities to be modulated by the semantics of what it registers from the world? When you say something to me, it’s obviously not just sound patterns, since the content influences what my brain is doing. How I’m going to respond is a consequence of what my brain did to produce the output.
Let’s think for a moment about physical events such as the status of bytes in the computer. Each bit in each sequence of 8 bits is either on or off. You can physically describe the nature of this machine and the hardware, but you can also think about representation: What does that pattern of physical activity stand for, and represent in its absence? You can think about interpreted rule-based systems, where the representations have an impact on other parts of a system, causing other representations to be formed, or combined, or operated on in various ways, and outputs to be generated. In this regard it's useful to think about computation in a computer to describe how the mind works, even though it’s a wrong metaphor for the brain.
A computer is based on a Von Neumann architecture, where you have a strict separation between memory and the central processing unit. This means that there is a strict separation between operations and representations, which sit passively in memory. The central processing unit is essentially a switching device that uses instructions to dictate what it’s going to do, both in terms of how it interprets successive sets of instructions and what it does with the representations. The very idea of representation depends on how the CPU is set. The exact same pattern of bytes can represent a number, a letter, or part of a picture depending on how it’s being interpreted. Once an operation is performed, the results go back into memory, and serve as input for additional processes. The computer is useful as a way of thinking about all of this, but it’s not going to turn out to be a model of how the brain works; the brain doesn’t work like this at all.
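The point that one pattern of bytes can stand for different things can be made concrete with a small sketch. The four byte values below are arbitrary and chosen only for illustration; they do not come from any particular program.

```python
import struct

# One fixed physical pattern: four bytes sitting passively in memory.
raw = bytes([72, 105, 33, 0])

# Interpretation 1: a single unsigned 32-bit number (little-endian byte order).
as_number = struct.unpack("<I", raw)[0]

# Interpretation 2: ASCII letters (the trailing zero byte is left out).
as_text = raw[:3].decode("ascii")

# Interpretation 3: brightness values for four pixels in a picture.
as_pixels = list(raw)

print(as_number)  # 2189640
print(as_text)    # Hi!
print(as_pixels)  # [72, 105, 33, 0]
```

Nothing in the bytes themselves says which reading is correct; the instruction that operates on them fixes the interpretation.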
The critical thing about the computer — in thinking about computation as a model for understanding the brain — is that it really lets us think about how advanced the mutual interaction of different levels of analysis is. It’s a wonderful mystery. How can an idea arise from wet stuff? How can an idea influence what’s going on with the wet stuff? Here, the analogy really works with the computer. We really can think about the notion of representation in the computer, and how it dictates the sequence of physical events through the organization of instructions.
Even though I try to track developments in computational ideas, I'm not known as somebody who espouses the computational model of mind. Nor am I considered a neuroscientist. In fact, as far as I can see, I’m not known as a card-carrying member of any particular approach or subfield. I’ve always been on the fringe. When I was a graduate student I stumbled onto the basic phenomenon I've been studying for 30-odd years now. In my first year of graduate school at Stanford—this was 1970—the idea of semantic memory was really hot. Collins and Quillian had published a simulation model in 1969 in which they claimed that information is stored in long-term memory in the most efficient way possible. (This makes no sense for the brain, by the way, since storage space is apparently not an issue although it is in a computer.) They posited that memories are organized into hierarchies in which you store information in as general a representation as possible. For example, for animals, you've got a representation of animals, and then birds, mammals, reptiles, etc. And then under birds you have canaries, robins, etc. The notion was that you store properties as high up in the hierarchy as you can rather than redundantly duplicating them. For example, birds eat, but so do lizards and dogs, so we store this property higher, up with the concept of animals. You tag the exceptions in a lower level.
One way to test this theory was to look at response times. If you give somebody a statement like "A canary can sing," that information should be stored right in the same place and "canary" and "sing" should be bound together. But if you ask him, "Can a canary eat?" he should have to traverse the network to find a connection between the two (assuming that "eat" is stored up with "animal"). It should take a little longer, and it does! Unfortunately for the model, distance in a semantic net turned out not to be crucial. My first year project at Stanford showed that the response time was just due to how closely associated the terms were, not distance in a net.
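The response-time prediction of the hierarchical model can be sketched in a few lines. This is only an illustration of the idea described above, not the original simulation: the node names, the property sets, and the function `levels_to_property` are assumptions made for the example.

```python
# Hypothetical fragment of a hierarchy: each concept points to its superordinate.
PARENT = {
    "canary": "bird",
    "robin": "bird",
    "bird": "animal",
    "lizard": "reptile",
    "reptile": "animal",
    "animal": None,
}

# Each property is stored once, as high in the hierarchy as it applies.
PROPERTIES = {
    "canary": {"sing"},
    "bird": {"fly"},
    "animal": {"eat"},
}


def levels_to_property(concept, prop):
    """Count how many links must be climbed before the property is found."""
    steps = 0
    node = concept
    while node is not None:
        if prop in PROPERTIES.get(node, set()):
            return steps
        node = PARENT[node]
        steps += 1
    return None  # the property is not stored anywhere above this concept


print(levels_to_property("canary", "sing"))  # 0
print(levels_to_property("canary", "eat"))   # 2
```

On the model's account, more links should mean a longer response time; the experiments described here found that association strength, not this link count, was what mattered.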
One of the experiments was particularly interesting. In one item I asked people to verify the statement: "A flea can bite—true or false?" Two people in a row said false, and afterwards I asked them why. One said that he "looked for" a mouth, and couldn’t find one. The other said he "looked for" teeth and couldn’t "see" any. This idea of "looking for" and "seeing" didn’t fit in at all with Collins and Quillian's network-based computer model, so I started thinking about it. My idea was that maybe imagery has something to do with this. I telephoned everybody whom I’d tested already, and asked them if they had tended to visualize when they were answering the question. Roughly half said they did and about half said they didn’t. I simply plotted the data separately for the two groups. What was dictating the response time of the people who said they didn’t use imagery was how associated the properties were with the objects. For the people who used imagery, that had nothing to do with it—it was how big the properties were.
I immediately designed an experiment where I pitted the two characteristics against each other. For example, I asked people, "True or false?: A mouse has a back," which is a trait that is big but not highly associated. I also asked whether it has whiskers, which is small and highly associated, or wings, which is not true. I found that if I instructed people to visualize, the critical thing was how big the properties were. The bigger they were the faster the responses were. If I asked them not to visualize, but to answer intuitively as fast as they could, the pattern reversed. In this case, the response speed depended on how associated the traits were, not how big they were.
The next question was how to think about these results. Fortuitously, at the same time I was doing these experiments I was taking a programming class. This was in the days when you used punch cards. You had to go to the computer center, submit your stack of cards, and stand around looking at a monitor, waiting for your job to come up and see whether it bombed, which you could tell by how long it was running. At the end they gave you a big printout. One of the exercises in the class was to program a set of little modules that generated geometric shapes, like triangles and squares and circles, and to adjust how big they were and where they were positioned. You had to do things like make a Christmas tree by recursively calling the same routine that generated a triangle, and plotting the triangle at different sizes in different positions, overlapping them to produce the design.
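A rough modern equivalent of that class exercise might look like the following. It is a loose reconstruction under stated assumptions: the original program ran from punch cards and is not shown here, so the character grid, the function names, and the rule for sizing and placing each triangle are all invented for illustration.

```python
def draw_triangle(grid, top_row, center_col, height):
    """Plot a filled triangle: one cell at the apex, one more cell per side on each row."""
    for i in range(height):
        for col in range(center_col - i, center_col + i + 1):
            grid[top_row + i][col] = "*"


def draw_tree(grid, top_row, center_col, height, tiers):
    """Recursively stack triangles, each larger and starting partway down the previous one."""
    if tiers == 0:
        return
    draw_triangle(grid, top_row, center_col, height)
    # The next triangle is one row taller and overlaps the lower half of this one.
    draw_tree(grid, top_row + height // 2, center_col, height + 1, tiers - 1)


def make_tree(tiers=3, height=3):
    """Build a blank grid large enough for the tree, draw it, and return it as text."""
    size = height + tiers
    grid = [[" "] * (2 * size + 1) for _ in range(tiers * size)]
    draw_tree(grid, 0, size, height, tiers)
    rows = ["".join(row).rstrip() for row in grid]
    return "\n".join(r for r in rows if r)


print(make_tree())
```

The same triangle routine is reused at every size and position; only the parameters passed to it change.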
As I was doing this, it suddenly occurred to me that this is an interesting model of mental imagery. We could think of imagery as having four main components: It’s got a deep representation, which is an abstract representation in long-term memory; it’s got a surface representation, which is like a display in a cathode ray tube; it’s got generative processes between the two, so the surface geometry is reconstructed in the surface image on the basis of the deep representation; and, finally, it's got interpretative processes that run off the surface image, interpreting the patterns as representing objects, parts or characteristics.
This metaphor was neat, and led me to conduct a lot of fruitful research. But it had the drawback that no matter how hard you hit somebody in the head, you’re not going to hear the sound of breaking glass—there's no screen in there. Even if there were, we would just be back to that problem of the little man looking at the screen. This immediately led me to start thinking about how to program a system where there are arrays that function as a surface image and points that are positioned in space depicting the pattern that represents an object. And then you have something much more abstract that's operated on to produce that.
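As a toy illustration of that arrangement, the sketch below keeps an abstract description, generates an array of filled points from it, and then answers a shape question by inspecting the array alone. It is not the simulation model discussed in this talk; the dictionary `DEEP`, the two shape names, and the functions `generate_surface` and `looks_pointed` are hypothetical.

```python
# Deep representation: an abstract description, not a picture.
DEEP = {
    "ear": {"shape": "triangle", "height": 4},
    "brick": {"shape": "block", "height": 4},
}


def generate_surface(spec):
    """Generative process: fill cells of a 2-D array according to the abstract description."""
    h = spec["height"]
    surface = [[0] * (2 * h - 1) for _ in range(h)]
    for r in range(h):
        # A triangle widens row by row; a block is full width on every row.
        half = r if spec["shape"] == "triangle" else h - 1
        for c in range(h - 1 - half, h - 1 + half + 1):
            surface[r][c] = 1
    return surface


def looks_pointed(surface):
    """Interpretive process: judge the shape from the array only, ignoring the deep description."""
    widths = [sum(row) for row in surface]
    return widths[0] == 1 and all(a < b for a, b in zip(widths, widths[1:]))


print(looks_pointed(generate_surface(DEEP["ear"])))    # True
print(looks_pointed(generate_surface(DEEP["brick"])))  # False
```

The question about the shape is answered by a routine that reads the array, so no little man is needed to look at it.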
One of the real virtues of thinking by analogy to the computer is that it focuses you on the idea of processing systems — not just isolated representations or processes, but sets of them working together. Nobody had ever tried to work out in detail what a processing system that uses images would look like. In fact, the few detailed models of imagery that existed all focused on very specific, artificial tasks and tried to model them using standard list-structures — there were no images in the models of imagery. We decided to take seriously the idea that perhaps mental images aren't represented the same way as language; perhaps they really are images. Steve Schwartz and I built a series of simulation models that showed such an approach is not only possible, but allows you to account for much data. We published our first paper on this in ’77, and another in ’78. I also wrote a book on it in 1980 called Image and Mind, where I worked this out in much more detail than anyone ever cared about. As far as I can tell it had almost no impact. I remember asking one of my professors at Stanford about it, and he thought the book was too detailed, and that for somebody to start working on the topic now they’d have to look at it, think about it and get into it, and it was just too much trouble. Psychologists generally don't like having to work with a really detailed theoretical framework, and that was basically the end of it. I have a mild frontal lobe disorder that leads me to perseverate, and thus I've continued to work out the theory and do experiments anyway. My 1994 book on imagery is a direct outgrowth of the earlier work, but now maps it into the brain. The Europeans (especially the French) and Japanese seem interested, if not the Americans.
That said, I should note that lately there are signs that interest in mental imagery might be picking up again. This might be a result of another round in my old debate with Zenon Pylyshyn. He’s a good friend of Jerry Fodor, but unlike Fodor, Pylyshyn has maintained forever that the experience of mental images is like heat thrown off by a light bulb when you’re reading: It's epiphenomenal, it plays no functional role in the process. Pylyshyn believes that mental images are just language-like representations and that it’s an illusion that there’s something different about them. He published his first paper in 1973. Jim Pomerantz and I replied to it in 1977 and the debate has just been rolling along ever since. Pylyshyn has great disdain for neuroscience, to put it mildly. He thinks it's useless, and has no bearing at all on the mind.
I really don’t know what brings him to this conclusion. I suspect it’s because he is one of the few people (less than 2 percent of the population) who do not experience imagery. He apparently doesn't even "get" jokes that depend on images. He also probably rejects the very idea of imagery on the basis of his intuitions about computation, based on Von Neumann architecture. He's clearly aware that computers don’t need pictorial depictive representations. His intuitions about the mind may be similar. But this is all speculation.
Pylyshyn is not only against theories that are rooted in neural mechanisms (he thinks theories of the logical structure of language should be a model for all other types of theories… really!), he's also against neural network computational models. I've probably published eight to ten papers using network models. At one point in my career I did work on the nature of spatial relations. I had the idea that there are actually two ways to represent relations among objects. One is what I call categorical, where a category defines an equivalence class. Some examples of this would be "left of," "right of," "above," "below," "inside," "outside," etc. If you are sitting across from me, from your point of view, this fist is to the right of this open palm, and that is true for all these different positions [moving his hand about, always to the right of the vertical axis created by his fist]. "Right of" defines a category, and even though I move my hand around, all of these positions are treated as equivalent.
This is useful for doing something like recognizing a hand, since the categorical spatial relations among my fingers, palm, digits, and joints do not change. That's handy for recognizing objects because if you store a literal picture in memory, an open-palm gesture might match well, but if I make another gesture with my hand, say clenching it, this would not match well — so you want something more abstract.
Categorical spatial relations are useful computationally for that problem, but they’re not useful at all for reaching or navigating. Just knowing that a fist is to the left of this palm won’t allow me to touch it precisely; I’ve got to know its exact location in space. If I’m walking around the room knowing the table’s in front of me, "in front of" is a categorical relation and thus is true for an infinite number of positions relative to it. This is not good enough for navigating. Thus, I posit a second type of spatial relation, which I call coordinate: Relative to some origin, the metric distance and direction are specified.
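The difference between the two codes can be shown with a minimal sketch. It assumes a flat two-dimensional layout with made-up coordinates, and the function names are hypothetical; it illustrates the distinction only and says nothing about how the brain computes either relation.

```python
import math


def categorical_relation(target, reference):
    """Collapse many positions into one equivalence class along the horizontal axis."""
    dx = target[0] - reference[0]
    if dx > 0:
        return "right of"
    if dx < 0:
        return "left of"
    return "aligned with"


def coordinate_relation(target, reference):
    """Give the metric distance and the direction (in degrees) relative to an origin."""
    dx = target[0] - reference[0]
    dy = target[1] - reference[1]
    return math.hypot(dx, dy), math.degrees(math.atan2(dy, dx))


fist = (0.0, 0.0)
for palm in [(3.0, 4.0), (1.0, -2.0), (10.0, 0.5)]:
    # The category stays the same while the metric values differ for every position.
    print(categorical_relation(palm, fist), coordinate_relation(palm, fist))
```

All three palm positions fall in the single category "right of", which is enough for recognition, while only the distance and direction would let a hand actually reach the target.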
In my lab we have shown that the left cerebral hemisphere is better at encoding categorical spatial relations, which makes sense because categories are often language-based. On the other hand (hemisphere), the right hemisphere is better at encoding coordinate spatial relations. This is true in normal people, it's true when we and others have done neuroimaging, and it's true when you look at deficits in patients who have brain lesions. We've constructed a whole raft of neural network models that showed that, in fact, if you split a model—a network—into two separate streams (one for each type of representation), it does better than if you have a single system trying to make both categorical and coordinate representations. The point is not so much that the hemispheres are different, but rather that the brain relies on two distinct ways to code spatial relations. This claim caused a mini-controversy. I'm delighted to see that in a recent issue of the Journal of Cognitive Neuroscience researchers (who I don't know) tested over a hundred people after they "turned off" one hemisphere at a time for medical reasons, and showed that with challenging tasks where you have to make categorical versus coordinate spatial relations judgments, the laterality effects I predicted worked beautifully. If it was too easy it didn't work, which also fits perfectly with our modeling and previous experiments—so it looks like this controversy has been settled (but experience has taught me that those are famous last words….).
This is really just one little corner of what I do, and ultimately is related to my imagery work. I’ve always argued that imagery has to be understood in a system that includes language-like propositional representations as well as depictive representations. I don’t think of the mind as purely imaginal. That can't be true. It's got to depend on coordinating many different types of representations that interact in intricate and interesting ways. The distinction between the two types of representations invites a further distinction between different forms of imagery, which make use of the different sorts of spatial relations. And in fact we have evidence for such a distinction. One clear conclusion from all this: "Imagery" isn't just "one thing."
Getting back to computational theorizing in psychology per se. From my perspective—and maybe I’m missing something—computational theorizing has reached a plateau. That’s not to say there isn’t progress, but it’s incremental and is currently within a paradigm that was set perhaps ten years ago. I don’t see any revolutionary work out there. Right now, the connectionists are probably the leaders in computational theorizing relevant to the brain. David Rumelhart did terrific work. Terry Sejnowski is excellent, as is Jay McClelland. These are people who have been at it for years. I don't see too much really new on the horizon.
In terms of interesting theorizing, Dan Dennett and Steve Pinker and their colleagues are trying to cash out the evolutionary psychology program. Instead of trying to think about behaviors as being the products of evolution, they are thinking about how the modular structure of information processing in the brain is a consequence of evolution. That's an interesting program. My objection is that this enterprise is not particularly empirical. Science is the process of finding things out. You've got to go out and do studies to find things out. It's very helpful to have theories as a base from which you can direct your attention to issues and questions, but then you’ve got to go do the actual research.
If you asked me to explain the direction of mind science writ large, I'd say that what you’re going to see is a bridging between cognitive neuroscience—where the mind is conceived of as what the brain does—and genetics. Those are the two really hot areas right now, and there’s a giant gulf between them.
I was recently writing an introductory psychology textbook chapter on intelligence, and read a lot of behavioral genetics. I was really struck by the fact that these guys are trying to bridge the gap from genes to behavior in one fell swoop, and they’re not doing that good a job at it. They're not doing that well in linkage studies that try to connect variability in a behavior with variability in different types of alleles. Sometimes they manage 2% of the variance. It occurred to me that they’re leaving out the middle man. They want to think in terms of the model: genes —> behavior. But it would be much better to think in terms of: genes —> brain, and then brain —> behavior. Genes influence behavior and cognition via what they do to the brain. Thinking about this has gotten me very interested in genetics, but not in the sense that genetics is a blueprint. Most genes functioning in the adult brain seem to be up-regulated and down-regulated by circumstances. They turn on and off.
Here’s an example developed by Steve Hyman that can serve as a metaphor: If you want to build muscles you lift weights. If the weight is heavy enough it’s going to damage the muscles. That damage creates a chemical cascade and reaches into the nuclei of your muscle cells, and turns on genes that make proteins and build up muscle fibers. Those genes are only turned on in response to some environmental challenge. That’s why you’ve got to keep lifting heavier and heavier weights. The phrase, "No pain no gain," is literally true in this case. Interaction with the environment turns on certain genes which otherwise wouldn’t be turned on; in fact, they will be turned off if certain challenges aren’t being faced. The same is true in the brain. Growing new dendritic spines, or even replenishing neurotransmitters, is linked to genes that are being turned on and turned off in response to what the brain is doing, which in turn is responding to environmental challenges.
I'm really interested in how genes allow the brain to respond to the tasks at hand. When genes are turned on and off, this affects what neurons are doing; which then, of course, affects how blood is allocated; in turn, affecting cognition and behavior. There is a gigantic project yet to be done that will have the effect of rooting psychology in the rest of natural science. Once this is accomplished, you'll be able to go from phenomenology—things like mental imagery—to information processing—thinking about things you can model on the computer—to the brain—thinking about how a particular kind of information processing arises from this particular brain we have—down through the workings of the neurons, including the biochemistry, all the way to the biophysics and the way that genes are up-regulated and down-regulated.
This is going to happen; I have no doubt at all. When it does we’re going to have a much better understanding of human nature than is otherwise going to be possible. If you want to understand evolution, the residue of evolution is the genes. Why not study the genes if you want to understand the reasons behind the brain's organization? There are reasons we have those genes rather than other ones—that’s where the evolutionary story comes in. But my particular brain or your particular brain is the way it is not only because of the particular genes we have, but also because of the way the environment up-regulated or down-regulated those genes during development, sculpting our brains certain ways, and the ways our genes respond to environmental and endogenous challenges. All of this is empirically tractable. The tools are available, the questions are clear, and we know what sort of answers to seek. Time to get cracking!