W. Daniel Hillis
Physicist, Computer Scientist; Co-Founder, Applied Invention; Author, The Pattern on the Stone
The Opinions Of Search Engines

Last year, Google made a fundamental change in the way that it searches. Previously a search for, say, "Museums of New York", would return web pages with sequences of letters that matched your search terms, like M-U-S-E-U-M. Now, besides the traditional keyword search, Google also performs a "semantic search" using a database of knowledge about the world. In this case, it will look for entities that it knows to be museums that are located within the geographic region that is named New York. To do this, the computers that perform the search must have some notion of what a museum is, what New York is, and how they are related. The computers must represent this knowledge and use it to make a judgment.
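The difference between the two methods can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a description of how Google's system is built: the page texts, the `Entity` record, and both function names are hypothetical, and the tiny knowledge base is hand-curated for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One entry in a hypothetical knowledge base."""
    name: str
    kind: str        # what the entity is, e.g. "museum"
    located_in: str  # the region the entity belongs to


# Hypothetical web pages, keyed by an invented page id.
PAGES = {
    "page-a": "A guide to the museums of New York",
    "page-b": "Visiting the Metropolitan Museum of Art",
    "page-c": "New York pizza reviews",
}

# Hypothetical hand-curated knowledge about the world.
ENTITIES = [
    Entity("Metropolitan Museum of Art", "museum", "New York"),
    Entity("Museum of Modern Art", "museum", "New York"),
    Entity("Louvre", "museum", "Paris"),
]


def keyword_search(query, pages):
    """Return ids of pages whose text contains every query term as a letter sequence."""
    terms = query.lower().split()
    return [pid for pid, text in pages.items()
            if all(term in text.lower() for term in terms)]


def semantic_search(kind, region, entities):
    """Return names of entities known to be of `kind` and located in `region`."""
    return [e.name for e in entities
            if e.kind == kind and e.located_in == region]


keyword_hits = keyword_search("museums New York", PAGES)
semantic_hits = semantic_search("museum", "New York", ENTITIES)
```

In this sketch the keyword search finds only the page that happens to contain the letters of every term, and it misses the page about the Metropolitan Museum of Art. The semantic search never looks at page text at all: it returns whatever the knowledge base asserts to be a museum in New York, which is exactly where the judgment enters.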

The search engine's judgments are based on knowledge of specific entities: places, organizations, songs, products, historical events, and even individual people. Sometimes these entities are displayed to the right of the results, which combine the findings from both methods of search. Google currently knows about hundreds of millions of specific entities. For comparison, the largest human-readable reference source, Wikipedia, has fewer than ten million entries. This is an early example of semantic search. Eventually, every major search engine will use similar methods. Semantics will displace traditional keywords as the primary method of search.

A problem becomes apparent if we change the example from "Museums of New York" to "Provinces of China." Is Taiwan such a province? This is a controversial question. With semantic search, either the computer or the curator of the knowledge will have to make a decision. Editors of published content have long made such judgments; now, the search engine makes these judgments in selecting its results. With semantic search, these decisions are not based on statistics, but on a model of the world.
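A minimal sketch shows how such a decision becomes part of the answer. Everything in it is hypothetical: the placeholder names ("Province A", "Territory T", "Country C"), the `province_of` relation, and the idea of storing knowledge as simple triples are assumptions made for illustration, not a claim about any real search engine's data.

```python
# Knowledge is modeled as (subject, relation, object) triples.
# All names are placeholders invented for this example.
SHARED_FACTS = {
    ("Province A", "province_of", "Country C"),
    ("Province B", "province_of", "Country C"),
}

# The one assertion on which the two hypothetical curators disagree.
DISPUTED_FACT = ("Territory T", "province_of", "Country C")

curator_one = SHARED_FACTS | {DISPUTED_FACT}  # accepts the disputed assertion
curator_two = set(SHARED_FACTS)               # leaves it out


def provinces_of(country, facts):
    """Return, in sorted order, every subject asserted to be a province of `country`."""
    return sorted(subject for (subject, relation, obj) in facts
                  if relation == "province_of" and obj == country)


answer_one = provinces_of("Country C", curator_one)
answer_two = provinces_of("Country C", curator_two)
```

The query and the code are identical in both cases; only the curated model differs. The user sees two different lists and no trace of the decision that produced them.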

What about a search for "Dictators of the World"? Here the results, which include a list of famous dictators, are not just the judgment of whether a particular person is a dictator, but also an implied judgment, in the collection of individual examples, of the very concept of a dictator. By building knowledge of concepts like "dictator" into our shared means of discovering information, we are implicitly accepting a set of assumptions.

Search engines have long been judges of what is important; now they are also arbiters of the truth. Different search engines, or different collections of knowledge, may evolve to serve different constituencies: one for mainland China, another for Taiwan; one for the liberals, another for the conservatives. Or, more optimistically, search engines may evolve new ways to introduce us to unfamiliar points of view, challenging us to new perspectives. Either way, their invisible judgments will frame our awareness.

In the past, meaning was only in the minds of humans. Now, it is also in the minds of tools that bring us information. From now on, search engines will have an editorial point of view, and search results will reflect that viewpoint. We can no longer ignore the assumptions behind the results.