Artificial intelligence, or AI, is the field that studies the synthesis and analysis of computational agents that act intelligently. Consider each part of this definition.
An agent is something that acts in an environment; it does something. Agents include worms, dogs, thermostats, airplanes, robots, humans, companies, and countries.
An agent is judged solely by how it acts. Agents that have the same effect in the world are equally good.
Intelligence is a matter of degree. The aspects that go into an agent acting intelligently include:
• what it does is appropriate for its circumstances, its goals, and its perceptual and computational limitations
• it takes into account the short-term and long-term consequences of its actions, including the effects on society and the environment
• it learns from experience
• it is flexible to changing environments and changing goals.
A computational agent is an agent whose decisions about its actions can be explained in terms of computation. That is, the decision can be broken down into primitive operations that can be implemented in a physical device. This computation can take many forms. In humans, this computation is carried out in “wetware”; in computers it is carried out in “hardware.” Although there are some agents that are arguably not computational, such as the wind and rain eroding a landscape, it is an open question whether all intelligent agents are computational.
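To make the idea of a computational agent concrete, here is a minimal sketch (not from the text) of a thermostat in Python. The class and method names are illustrative assumptions; the point is that the agent's decision reduces to primitive operations, comparisons and assignments, that could be implemented in a physical device.

```python
# A minimal sketch of a thermostat as a computational agent.
# The names (Thermostat, decide) are illustrative, not from the text.

class Thermostat:
    """An agent whose decisions reduce to primitive operations."""

    def __init__(self, target, tolerance=1.0):
        self.target = target          # desired temperature
        self.tolerance = tolerance    # acceptable deviation

    def decide(self, temperature):
        """Map a percept (the current temperature) to an action."""
        if temperature < self.target - self.tolerance:
            return "heat_on"
        if temperature > self.target + self.tolerance:
            return "heat_off"
        return "no_op"

agent = Thermostat(target=20.0)
for reading in [17.5, 19.8, 22.3]:
    print(reading, "->", agent.decide(reading))
```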
All agents are limited. No agent is omniscient (all knowing) or omnipotent (able to do anything). Only in very specialized and constrained domains can an agent observe everything about its environment. Agents have finite memory. Agents in the real world do not have unlimited time to act.
The central scientific goal of AI is to understand the principles that make intelligent behavior possible in natural or artificial systems. This is done by:
• the analysis of natural and artificial agents
• formulating and testing hypotheses about what it takes to construct intelligent agents
• designing, building, and experimenting with computational systems that perform tasks commonly viewed as requiring intelligence.
As part of science, researchers build empirical systems to test hypotheses or to explore the space of possible designs. These are distinct from applications, which are built to be useful in a particular domain.
Note that the definition is in terms of intelligent action, not intelligent thought. The role of thought is to affect action and lead to more intelligent behavior.
The central engineering goal of AI is the design and synthesis of agents that act intelligently, which leads to useful artifacts.
Building general intelligence is not the only goal of AI researchers. The aim of intelligence augmentation is to augment human intelligence and creativity. A diagnostic agent helps medical practitioners make better decisions, a search engine augments human memory, and natural language translation systems help people communicate. AI systems often operate in human-in-the-loop mode, where humans and agents work together to solve problems. Sometimes the action of an artificial agent is to give advice to a human; sometimes humans give advice or feedback to artificial agents, particularly in cases where decisions are made quickly or repeatedly.
Artificial intelligence (AI) is the established name for the field, but the term “artificial intelligence” is a source of much confusion because artificial intelligence may be interpreted as the opposite of real intelligence.
For any phenomenon, you can distinguish real versus fake, where the fake is non-real. You can also distinguish natural versus artificial. Natural means occurring in nature and artificial means made by people.
A tsunami is a large wave in an ocean. Natural tsunamis occur from time to time and are caused by earthquakes or landslides. You could imagine an artificial tsunami that was made by people, for example, by exploding a bomb in the ocean, yet which is still a real tsunami. One could also imagine fake tsunamis: either artificial, using computer graphics, or natural, such as a mirage that looks like a tsunami but is not one.
It is arguable that intelligence is different: you cannot have fake intelligence. If an agent behaves intelligently, it is intelligent. It is only the external behavior that defines intelligence; acting intelligently is being intelligent. Thus, artificial intelligence, if and when it is achieved, will be real intelligence created artificially.
This idea of intelligence being defined by external behavior was the motivation for a test for intelligence designed by Turing [1950], which has become known as the Turing test. The Turing test consists of an imitation game where an interrogator can ask a witness, via a text interface, any question. If the interrogator cannot distinguish the witness from a human, the witness must be intelligent. Figure 1.1 shows a possible dialog that Turing suggested. An agent that is not really intelligent could not fake intelligence for arbitrary topics.
There has been much debate about the usefulness of the Turing test. Unfortunately, although it may provide a test for how to recognize intelligence, it does not provide a way to realize intelligence.
Levesque [2014] suggested a new form of question, called a Winograd schema after the following example from Winograd [1972]:
The city councilmen refused the demonstrators a permit because they feared violence. Who feared violence?
The city councilmen refused the demonstrators a permit because they advocated violence. Who advocated violence?
These two sentences differ in only one word, feared/advocated, but have opposite answers.
Winograd schemas have the property that (a) humans can easily disambiguate them and (b) there is no simple grammatical or statistical test that could disambiguate them (see the sketch after the examples below). For example, the sentences above would not qualify if the phrase “demonstrators feared violence” were much more or less likely than the phrase “councilmen feared violence”, independently of the context, and similarly for advocating.
The following examples are due to Davis [2015]:
• Steve follows Fred’s example in everything. He [admires/influences] him hugely. Who [admires/influences] whom?
• The table won’t fit through the doorway because it is too [wide/narrow]. What is too [wide/narrow]?
• Grace was happy to trade me her sweater for my jacket. She thinks it looks [great/dowdy] on her. What looks [great/dowdy] on Grace?
• Bill thinks that calling attention to himself was rude [to/of] Bert. Who called attention to himself?
Each of these has its own reason why one answer is preferred to the other. A computer that can reliably answer these questions needs to know about all of these reasons, and arguably requires the ability to do commonsense reasoning. Common sense should also allow it to reject sentences such as “The doorway won’t fit through the chair because it is too narrow. What is too narrow?”.
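To illustrate property (b) above, consider what a “simple statistical test” might look like: pick the candidate referent whose pairing with the verb phrase is more frequent, independently of the sentence. The sketch below is illustrative only; corpus_count is a hypothetical stand-in for a real corpus lookup, and the counts are invented. For a genuine Winograd schema the two counts are comparable by construction, so this heuristic is essentially guessing.

```python
# A naive frequency-based disambiguator, shown only to illustrate
# why Winograd schemas defeat simple statistical tests.
# corpus_count is a hypothetical stand-in for a real corpus lookup.

def corpus_count(phrase):
    # Invented counts; for a true Winograd schema, the candidate
    # phrases are about equally likely out of context.
    fake_counts = {
        "councilmen feared violence": 103,
        "demonstrators feared violence": 97,
    }
    return fake_counts.get(phrase, 0)

def naive_disambiguate(candidates, verb_phrase):
    """Pick the candidate whose pairing with the verb phrase is
    more frequent, ignoring the rest of the sentence."""
    return max(candidates, key=lambda c: corpus_count(f"{c} {verb_phrase}"))

print(naive_disambiguate(["councilmen", "demonstrators"], "feared violence"))
# The counts are close by design, so the choice is little better than
# chance; the heuristic ignores the causal structure ("because they
# feared violence") that humans use to disambiguate.
```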
Figure 1.2 shows some answers provided by ChatGPT [OpenAI, 2022], based on GPT-3 [Brown et al., 2020], one of the most capable large language models in 2022. ChatGPT gives a different answer each time it is called. You can decide whether it solves this Winograd schema. The technology behind GPT-3 and related models is described in Section 8.5.
Grosz [2012], arguing that language is inherently social and connected to human cooperation, suggested that a more appropriate test should involve purposeful natural language, not language just for the purpose of tricking a human. She suggested the question:
Is it imaginable that a computer (agent) team member could behave, over the long term and in uncertain, dynamic environments, in such a way that people on the team will not notice it is not human?
– Barbara Grosz [2012]
An equal member of the team needs to be trusted enough to act appropriately in the world, to know when to ask questions, and to know when not to act. This challenge also allows for incremental improvement, starting with simple group interactions before moving to complex ones.
Interacting in natural language is not the only aspect of intelligence. An agent acting in an environment needs common sense, “the ability to make effective use of ordinary, everyday, experiential knowledge in achieving ordinary, practical goals” [Brachman and Levesque, 2022b]. Here, knowledge is used in a general way to mean any non-transient information in an agent. Such knowledge is typically not stated in natural language; people do not state what everyone knows. Some knowledge, such as how to ride a bike or recognize a face, cannot be effectively conveyed by natural language. Formalizing common sense has a long history [McCarthy, 1958; Davis, 1990], including the development of representations and actual commonsense knowledge.
The obvious naturally intelligent agent is the human being. Some people might say that worms, insects, or bacteria are intelligent, but more people would say that dogs, whales, or monkeys are intelligent (see Exercise 1.1). One class of intelligent agents that may be more intelligent than humans is the class of organizations. Ant colonies are a prototypical example of organizations. Each individual ant may not be very intelligent, but an ant colony can act more intelligently than any individual ant. The colony can discover food and exploit it very effectively, as well as adapt to changing circumstances. Corporations can be more intelligent than individual people. Companies develop, manufacture, and distribute products where the sum of the skills required is much more than any individual could master. Modern computers, from low-level hardware to high-level software, are more complicated than any single human can understand, yet they are manufactured daily by organizations of humans. Human society viewed as an agent is arguably the most intelligent agent known.
It is instructive to consider where human intelligence comes from. There are three main sources:
• Humans have evolved into adaptable animals that can survive in various habitats.
• Culture provides not only language, but also useful tools, useful concepts, and the wisdom that is passed from parents and teachers to children.
• Humans learn throughout their life and accumulate knowledge and skills.
These sources interact in complex ways. Biological evolution has provided stages of growth that allow for different learning at different stages of life. Biology and culture have evolved together; humans can be helpless at birth, presumably because of our culture of looking after infants. Culture interacts strongly with learning. A major part of lifelong learning is what people are taught by parents and teachers. Language, which is part of culture, provides distinctions in the world that are useful for learning.
When building an intelligent system, the designers have to decide which of these sources of intelligence need to be programmed in, and which can be learned. It is very unlikely that anyone will be able to build an agent that starts with a clean slate and learns everything, particularly for non-repetitive tasks. At the same time, most interesting and useful intelligent agents learn to improve their behavior.