As each generation of algorithms gets smarter, it also becomes more incomprehensible. But to deal with machines that think, we must understand how they think.
We have, perhaps for the first time ever, built machines we do not understand.
We programmed them, so we understand each of the individual steps. But a machine takes billions of these steps and produces behaviors—chess moves, movie recommendations, the sensation of a skilled driver steering through the curves of a road—that are not evident from the architecture of the program we wrote.
We've made this incomprehensibility easy to overlook. We've designed machines to act the way we do: they help drive our cars, fly our airplanes, route our packages, approve our loans, screen our messages, recommend our entertainment, suggest our next potential romantic partners, and enable our doctors to diagnose what ails us. And because they act like us, it would be reasonable to imagine that they think like us too. But the reality is that they don't think like us at all; at some deep level we don't even really understand how they're producing the behavior we observe. This is the essence of their incomprehensibility.
Does it matter? Should we worry that we're building systems whose increasingly accurate decisions are based on incomprehensible foundations?
First, and most simply, it matters because we regularly find ourselves in everyday situations where we need to know why. Why was I denied a loan? Why was my account blocked? Why did my condition suddenly get classified as "severe"? And sometimes we need to know why in cases where the machine truly made a mistake. Why did the self-driving car abruptly go off the road on a clear sunny day? It's hard to troubleshoot problems when you don't understand why they're happening.
But there are deeper troubles too; to talk about them, we need to understand a bit more about how these algorithms work today. They are trained on massive quantities of data, and they are unimaginably good at picking up on the subtle patterns this data contains. We know, for example, how to build systems that can look at millions of identically structured loan applications from the past, all encoded the same way, and start to identify the recurring patterns in the loans that—in retrospect—were the right ones to grant. It's hard to get human beings to read millions of loan applications, and they wouldn't do as well as the algorithm even if they did.
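To make the setting concrete, here is a minimal sketch, in Python with scikit-learn, of the kind of system just described. The file and column names are hypothetical, and a real lender's pipeline would be far more elaborate; the point is only that the model assembles its own rules from millions of past outcomes rather than from rules anyone wrote down.

    # A hypothetical sketch: learn from a large table of identically structured
    # past loan applications which ones, in retrospect, were the right ones to grant.
    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    applications = pd.read_csv("past_loan_applications.csv")   # hypothetical data file
    features = applications[["income", "debt", "credit_history_length", "loan_amount"]]
    repaid = applications["repaid"]   # 1 if the loan turned out to be a good one

    X_train, X_test, y_train, y_test = train_test_split(
        features, repaid, test_size=0.2, random_state=0)

    # The fitted model is hundreds of small decision trees added together.
    model = GradientBoostingClassifier().fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))

No single step here is mysterious, yet the fitted model offers nothing like the concise justification a human loan officer could give for any particular decision.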
This is a genuinely impressive achievement, but a brittle one. The algorithm has a narrow comfort zone where it can be effective; it's hard to characterize this comfort zone but easy to step out of it. For example, you might want to move on from the machine's success classifying millions of small consumer loans and instead give it a database of loan histories from a few thousand complex businesses. But in doing so, you've lost the ingredients that make the machine so strong—it draws its power from access to a huge number of data points, a mind-numbingly repetitive history of past instances in which to find patterns and structure. Reduce the amount of data dramatically, or make each data point significantly more complex, and the algorithm quickly starts to flail. Watching the machine's successes—and they're phenomenal when the conditions are right—is a bit like marveling at the performance of a prodigy, whose jaw-dropping achievements and unnerving singleness of purpose can mask his or her limitations in other dimensions.
But even in the heart of the machine's comfort zone, its incomprehensible reasoning leads to difficulties. Take the millions of small consumer loan applications again, the structured task where it was doing so well. Trouble arrives as soon as any of the machine's customers, managers, or assistants start asking a few simple questions.
A consumer whose loan was denied might ask not just for an explanation but for something more actionable: "How could I change my application next year to have a better chance of success?" Since we don't have a simple explanation for the algorithm's decision, there's no good answer to give. The best we can offer is a shrug: try to make your application look more like one of the ones that was granted. Next question.
An executive might ask, "The algorithm is doing very well on loan applications in the United Kingdom. Will it also do well if we deploy it in Brazil?" There's no satisfying answer here either; we're not good at assessing how well a highly optimized rule will transfer to a new domain.
A data scientist might say, "We know how well the algorithm does with the data it has. But surely more information about the consumers would help it. What new data should we collect?" Our human domain knowledge suggests lots of possibilities, but with an incomprehensible algorithm, we don't know which of them will actually help. Think of the irony: we could try picking the variables we ourselves would find useful. But the machine doesn't think like us; indeed, it's already outperforming us. So how do we know what it will find useful?
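In practice, about the only reliable way to answer the data scientist is empirical: collect a candidate variable, retrain, and see whether the score moves. A hedged sketch, reusing the same hypothetical table and column names as before:

    # Hypothetical sketch: we cannot predict which new variables the model will
    # find useful; we can only add them one at a time and measure the difference.
    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import cross_val_score

    applications = pd.read_csv("past_loan_applications.csv")   # hypothetical data file
    repaid = applications["repaid"]

    baseline_columns = ["income", "debt", "credit_history_length", "loan_amount"]
    candidates = ["years_at_address", "number_of_dependents", "utility_payment_history"]

    baseline = cross_val_score(GradientBoostingClassifier(),
                               applications[baseline_columns], repaid).mean()

    for extra in candidates:
        score = cross_val_score(GradientBoostingClassifier(),
                                applications[baseline_columns + [extra]], repaid).mean()
        print(f"{extra}: {score - baseline:+.4f} change vs. baseline")

Notice what this does and doesn't tell us: we learn which additions helped only after the data has already been gathered. The machine's priorities are discovered after the fact, not reasoned out in advance.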
This doesn't need to be the end of the story; we're starting to see an interest in building algorithms that are not only powerful but also understandable by their creators. To do this, we may need to seriously rethink our notions of comprehensibility. We might never understand, step-by-step, what our automated systems are doing; but that may be okay. It may be enough that we learn to interact with them as one intelligent entity interacts with another, developing a robust sense for when to trust their recommendations, where to employ them most effectively, and how to help them reach a level of success that we will never achieve on our own.
Until then, however, the incomprehensibility of these systems creates a risk. How do we know when the machine has left its comfort zone and is operating on parts of the problem it's not good at? The extent of this risk is not easy to quantify, and it is something we must confront as our systems develop. We may eventually have to worry about all-powerful machine intelligence. But first we need to worry about putting machines in charge of decisions that they don't have the intelligence to make.