Acquiring values
Capability control is not a permanent control method. The question Bostrom addresses here therefore follows: since a superintelligence cannot feasibly be restrained forever, humans should strive to instill in it values that it will follow when pursuing its final goals.
The value-loading problem
However, it is impossible to anticipate all the circumstances that the actions of a superintelligence might bring about. Likewise, it is impossible to develop “a list of all possible worlds and assign each of them a value.”
To help ensure that the superintelligence makes good decisions, a utility function can be largely beneficial. A utility function “assigns value to each outcome that might obtain, or more generally to each ‘possible world’.” The superintelligence would then, at every decision point, identify the action with the highest expected utility. (“The expected utility is calculated by weighting the utility of each possible world with the subjective probability of that world being the actual world conditional on a particular action being taken.”) Although the expected utility of an action cannot be calculated exactly, given the countless possible worlds, a utility function guiding the superintelligence’s actions can establish a normative ideal, an optimality notion.
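The expected-utility calculation described above can be sketched with a toy example. The worlds, actions, and all the numbers here are hypothetical, chosen only to illustrate the weighting:

```python
# Utility assigned to each possible world (illustrative values).
utility = {"world_a": 10.0, "world_b": -5.0, "world_c": 1.0}

# Subjective probability that each world is the actual world,
# conditional on a particular action being taken.
prob_given_action = {
    "act_1": {"world_a": 0.2, "world_b": 0.5, "world_c": 0.3},
    "act_2": {"world_a": 0.6, "world_b": 0.1, "world_c": 0.3},
}

def expected_utility(action):
    """Weight each world's utility by its probability under the action."""
    return sum(p * utility[w] for w, p in prob_given_action[action].items())

# The agent picks the action with the highest expected utility.
best = max(prob_given_action, key=expected_utility)
print(best, expected_utility(best))  # act_2 5.8
```

In a realistic setting the set of possible worlds is far too large to enumerate like this, which is exactly why Bostrom treats the utility function as a normative ideal rather than a computable procedure.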
Evolutionary selection
Evolution is a specific class of search algorithms that alternate between two steps: the first grows a population of solution candidates by creating new candidates, for instance through sexual recombination; the second shrinks the population by excluding candidates that score poorly in evaluation tests.
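The two alternating steps can be sketched as a minimal evolutionary search. This is an illustrative toy (not from the book) that maximizes a simple fitness function, with averaging crossover standing in for recombination:

```python
import random

random.seed(0)

def fitness(x):
    # Toy evaluation test: best candidate is x = 3.
    return -(x - 3.0) ** 2

population = [random.uniform(-10, 10) for _ in range(20)]

for generation in range(50):
    # Step 1: grow the population by recombining random parent pairs
    # (crossover plus a small mutation).
    children = []
    for _ in range(20):
        a, b = random.sample(population, 2)
        children.append((a + b) / 2 + random.gauss(0, 0.1))
    population += children
    # Step 2: shrink the population, excluding low-scoring candidates.
    population = sorted(population, key=fitness, reverse=True)[:20]

best = max(population, key=fitness)
print(round(best, 2))  # converges near 3
```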
Reinforcement learning
Reinforcement learning is “an area of machine learning that studies techniques whereby agents can learn to maximize some notion of cumulative reward.”
An example of reinforcement learning is a program that learns to play backgammon by incrementally improving its evaluation of possible board positions.
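A full backgammon learner is beyond a short example, but the core idea of maximizing cumulative reward can be shown with a minimal Q-learning sketch (a standard reinforcement-learning technique, used here illustratively): an agent on a five-cell corridor learns to walk toward a reward at the far end.

```python
import random

random.seed(1)

N_STATES, ACTIONS = 5, (-1, +1)       # move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1

for episode in range(200):
    s = 0
    while s < N_STATES - 1:
        # Epsilon-greedy: mostly exploit the best-known action,
        # occasionally explore a random one.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0   # reward only at the goal
        # Update the value estimate toward reward plus discounted future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

# Greedy action in each non-terminal state after learning.
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent never sees the reward structure in advance; its evaluation of each state improves incrementally from experience, much as the backgammon program's board evaluation does.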
Motivational scaffolding
Another possible solution to the value-loading problem is motivational scaffolding. Motivational scaffolding provides the seed AI with an interim goal system of simple final goals that the programmers can specify with precise coding. As the AI grows more intelligent, the interim goal system can be substituted with one that has different final goals. “This successor goal system then governs the AI as it develops into a full-blown superintelligence. Because the scaffold goals are not just instrumental but final goals for the AI, the AI might be expected to resist having them replaced (goal-content integrity being a convergent instrumental value).”
Nevertheless, as a control method, a scaffold motivation system could include the goal of accepting online guidance from the programmers, so that they can replace any of the AI’s current goals. It could also include the goal of being transparent to the programmers about its values and strategies.
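The two scaffold properties just mentioned, replaceability under programmer guidance and transparency about current values, can be sketched as a hypothetical design (all names here are illustrative, not from the book):

```python
class ScaffoldAgent:
    """Toy agent whose interim goal system stays replaceable and inspectable."""

    def __init__(self, interim_goals):
        self.goals = list(interim_goals)   # simple, precisely specified goals

    def replace_goals(self, new_goals, authorized):
        """Online guidance: programmers may swap in a successor goal system."""
        if not authorized:
            raise PermissionError("only programmers may change goals")
        self.goals = list(new_goals)

    def report(self):
        """Transparency: expose current values for programmer inspection."""
        return tuple(self.goals)

agent = ScaffoldAgent(["obey shutdown commands", "answer questions honestly"])
agent.replace_goals(["successor goal system"], authorized=True)
print(agent.report())  # ('successor goal system',)
```

The hard part, which no code sketch captures, is ensuring the AI actually wants to keep these properties rather than merely having them at the start.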
Value learning
One essential approach to the value-loading problem is using the AI’s own intelligence to learn the values humans intend it to pursue. To achieve this, programmers could provide the AI with an implicitly defined set of adequate values. The AI can then, using its own estimates and calculations, choose actions suited to those values.
An AI with human-level general intelligence can have the following final goal: ‘Maximize the realization of the values described in the envelope.’ The AI cannot at first know what is written in the envelope. “But it can form hypotheses, and it can assign those hypotheses probabilities based on their priors and any available empirical data. For instance, the agent might have encountered other examples of human-authored texts, or it might have observed some general patterns of human behavior. This would enable it to make guesses.”
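The envelope thought experiment can be sketched numerically. In this toy (all hypotheses, likelihoods, and utilities are made-up illustrations), the agent keeps a posterior over hypotheses about the intended values and chooses the action with the highest expected utility under that uncertainty:

```python
# Prior probabilities over hypotheses about what the envelope says.
priors = {"values_are_V1": 0.5, "values_are_V2": 0.5}

# P(observed human behavior | hypothesis) -- illustrative evidence.
likelihood = {"values_are_V1": 0.9, "values_are_V2": 0.2}

# Bayesian update: posterior proportional to prior * likelihood.
unnorm = {h: priors[h] * likelihood[h] for h in priors}
z = sum(unnorm.values())
posterior = {h: p / z for h, p in unnorm.items()}

# Utility of each candidate action under each value hypothesis.
utility = {
    "act_cautious": {"values_are_V1": 5.0, "values_are_V2": 5.0},
    "act_bold":     {"values_are_V1": 8.0, "values_are_V2": -20.0},
}

def expected_utility(action):
    return sum(posterior[h] * utility[action][h] for h in posterior)

chosen = max(utility, key=expected_utility)
print(chosen)  # act_cautious
```

Note how the bold action, catastrophic under one hypothesis, loses to the cautious one even though the cautious action is never optimal under any single hypothesis; residual uncertainty about the intended values makes the agent conservative.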
For clarification purposes, the challenge is not whether the AI can comprehend human intentions, since a superintelligence should easily achieve such understanding. Instead, the challenge is for humanity to ensure that the AI will be motivated to pursue the values as the human programmers intended them.
Emulation modulation
Another approach applies to emulations: one could influence the motivational state of an emulation by providing the digital equivalent of psychoactive substances (or, in the case of biological systems, the actual chemicals). Even with current technologies, values and motivations can be pharmacologically manipulated to a limited extent.
Institution design
We could also develop agents that together form an institution; the motivations of such a system depend both on the motivations of its sub-agents and on how those sub-agents are organized. Bostrom gives the example that “a group that is organized under strong dictatorship could behave as if it had a will that was identical to the will of the subagent that occupies the dictator role, whereas a democratic group might sometimes behave more as if it had a will that was a composite or average of the wills of its various constituents.”
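Bostrom's contrast between the two organizational forms can be sketched as a toy aggregation rule. The sub-agents, their preference values, and the two rules here are hypothetical simplifications:

```python
# Each sub-agent prefers some value of a shared decision parameter.
subagent_preferences = {"dictator": 9.0, "member_b": 2.0, "member_c": 4.0}

def institutional_will(prefs, organization):
    """The institution's effective 'will' depends on how it is organized."""
    if organization == "dictatorship":
        # The group's will is identical to the dictator's.
        return prefs["dictator"]
    if organization == "democracy":
        # The group's will is an average of its constituents' wills.
        return sum(prefs.values()) / len(prefs)
    raise ValueError(f"unknown organization: {organization}")

print(institutional_will(subagent_preferences, "dictatorship"))  # 9.0
print(institutional_will(subagent_preferences, "democracy"))     # 5.0
```

The same sub-agents yield different institutional behavior under the two rules, which is the point: motivation at the system level is partly a product of organization, not just of the parts.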