Superintelligence: Paths, Dangers, Strategies by Nick Bostrom
Preface
The book has been highly recommended by the likes of Elon Musk and Bill Gates, among others, and has been a New York Times best seller. It provides incredible insight into the possibility that there could come a time when machines exceed the general intelligence of humans. At that point, humans would no longer be the most intelligent creatures on the planet, and a new superintelligence would rise to become the dominant one.
If we do not manage to control this emergence of superintelligence, humanity will face catastrophe. Notably, a short while after the release of this book, Elon Musk stated that artificial intelligence is the greatest threat to humanity, even more so than nuclear weapons.
However, as the author, Nick Bostrom, points out, we have the advantage of creating these machines. This is an advantage that other animals did not have when humans became the dominant species on Earth. As Bostrom continues to assert, we must design these superintelligent machines to protect human values; however, it appears incredibly challenging to control what this superintelligence will be capable of doing. Therefore, as soon as unfriendly superintelligence exists, it could stop us from replacing or destroying it and "our fate would be sealed".
Correspondingly, this book is trying to address how humanity could best respond to superintelligence and the potential that it will have.
“This is quite possibly the most important and most daunting challenge humanity has ever faced. And—whether we succeed or fail—it is probably the last challenge we will ever face.”
It is interesting that Bostrom concedes his claims may turn out to be wrong; in his view, however, it would be preposterous to dismiss the possibility of superintelligence outright.
Chapter 1
Growth modes and big history
The author begins this Chapter by rightly identifying that growth has been accelerating: each generation has grown faster than the one before it. For instance, in earlier eras people would have found it irrational to suggest that the world economy would one day double several times within a single lifetime, yet that is a reality in today's era. In light of this, it is not unreasonable to suggest that the potential for growth of superintelligence in future generations is vast.
In fact, Moore's law observes that the number of transistors on a chip (a rough proxy for the power of computers) doubles roughly every two years.
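As a quick illustration of how this compounding plays out (a sketch of the commonly cited two-year doubling figure, not an exact physical law), a short calculation shows the growth factor over time:

```python
# Illustrative sketch of Moore's law-style growth: capacity doubling
# roughly every two years. The doubling period is the commonly cited
# figure, treated here as a simplifying assumption.
def moores_law_growth(years: float, doubling_period: float = 2.0) -> float:
    """Return the growth factor after `years` of steady doublings."""
    return 2 ** (years / doubling_period)

# Over a 20-year span, ten doublings multiply capacity by 2**10.
print(moores_law_growth(20))  # 1024.0
```

The point of the sketch is simply that steady doubling produces a thousand-fold gain in two decades, which is why earlier generations would have found modern growth rates implausible.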
Great expectations
Since the invention of computers in the 1940s, it has been imagined that machines could one day develop human-level intelligence with the capability to learn, reason and solve complex problems. As the mathematician I. J. Good asserted in 1965:
"Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control."
Seasons of hope and despair
As Bostrom rightly identifies, there were some eras when AI fell out of fashion and others when the experts were extolling AI and being highly optimistic about it. One boom occurred in the 1980s, when Japan created its Fifth-Generation Computer Systems Project and AI funding increased in various parts of the world. When the Fifth-Generation Computer Systems Project failed to meet its expectations, a period of drought in AI emerged and funding decreased.
Additionally, Bostrom continued to state that the ideal machine would be like the concept of the Bayesian agent "that makes probabilistically optimal use of available information." However, "this ideal is unattainable because it is too computationally demanding to be implemented in any physical computer."
State of the art
This sub-chapter describes the major milestones that AI has achieved so far in society. Accordingly, AI can beat the best chess player in the world, and as experts in the late fifties once asserted: "If one could devise a successful chess machine, one would seem to have penetrated to the core of human intellectual endeavor." However, it is suggested that it has not penetrated the core of human intellect yet because, as Bostrom points out, "It can play chess; it can do no other." Appropriately, common sense and natural language have turned out to be much more challenging to attain than expected. Bostrom rightly acknowledges that the success in chess-playing was the result of a rather simpler algorithm than anticipated. If common sense, natural language and general reasoning could be instilled in a machine, it would very likely be able to do as much as a human can do, or come very close.
Furthermore, AI is already involved in countless sectors and industries. As Peter Diamandis stated, "AI will affect every single industry on Earth." Examples of AI being utilised include finance (stock-trading systems), health (identifying diseases early, e.g. breast cancer), law and self-driving vehicles, among others. The world also contains a population of 10 million robots.
Opinions about the future of machine intelligence
In recent years, there has been a dramatic rise of interest in AI. This is evident from the large amounts of investment received by AI companies, from the numerous students choosing to study AI at university or take AI-related courses, and indeed from the very reading of this book and of this summary on this blog.
Correspondingly, there is also widespread interest from the public in knowing when 'human-level machine intelligence' (HLMI) will be achieved, if ever. As Bostrom reports, relevant research has concluded that there is a "10% probability of HLMI by 2022, 50% probability by 2040, and 90% probability by 2075".
Chapter 2
Paths to superintelligence
As the previous Chapter indicates, it is highly likely that machines will have achieved superintelligence by the end of the century. However, they are vastly inferior to human intelligence at the moment. Therefore, this Chapter will consider how machines can obtain superintelligence. Correspondingly, the very fact that there are various paths towards superintelligence raises the probability that it will occur. Superintelligence is defined by Bostrom as “any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest.”
Artificial Intelligence
The first path that will be examined, and the most likely to achieve superintelligence, is artificial intelligence. As Bostrom asserts, for this to be accomplished the AI system would need a capacity to learn and an ability to deal effectively with uncertainty and probabilistic information. This was first proposed by Alan Turing, who described such a machine as a ‘child machine’, meaning that its intelligence would depend on how well it could learn. Additionally, the path toward superintelligence in AI could be pursued by “running genetic algorithms on sufficiently fast computers, achieving results comparable to those of biological evolution”.
Additionally, it is feasible to study the human brain and use it as a template for machine intelligence. This idea is distinct from whole brain emulation, which we will discuss in the next subsection.
An interesting idea is also suggested by Bostrom, who recognised that a hybrid approach, combining some brain-inspired techniques with some purely artificial methods, would probably be adequate to achieve superintelligence. Consequently, the system in question would not have the features of a human brain, despite some brain-derived features being in the system. Bostrom cannot put a timeframe on this, because future advances in brain science cannot be predicted; but the more brain science advances, the more likely it is that machine intelligence could be developed in this way.
Whole brain emulation
Another path that could be used to attain superintelligence is whole brain emulation (also known as “uploading”), in which “intelligent software would be produced by scanning and closely modeling the computational structure of a biological brain.”
There are certain steps required to accomplish this:
An accurate scan of a particular human brain is created.
The raw data from the scanners would be fed to a computer for automated image processing to reconstruct the three-dimensional neuronal network that implemented cognition in the original brain.
The neurocomputational structure produced from the second step would be installed on a sufficiently powerful computer. If completely successful, the consequence would be a digital reconstruction of the original intellect, with memory and personality intact. The emulated human mind now exists as software on a computer.
Whole brain emulation does, however, require some rather advanced enabling technologies.
“There are three key prerequisites:
Scanning: high-throughput microscopy with sufficient resolution and detection of relevant properties;
Translation: automated image analysis to turn raw scanning data into an interpreted three-dimensional model of relevant neurocomputational elements; and
Simulation: hardware powerful enough to implement the resultant computational structure.”
Therefore, better computational power is needed in order to go down this path as well, since no brain has yet been emulated.
The objective of such an approach is to capture enough of the computationally functional resources of the brain to allow the consequential emulation to perform intellectual tasks. Consequently, the confusing biological features of a real brain are immaterial.
It is true that the emulation path is not likely to be achieved in the near future because much of the technology that is needed has not yet been developed. However, a person could in principle code a seed AI on an ordinary current personal computer; and therefore it is possible, although unlikely, that a person could establish the right intuition for developing these technologies in the near future.
Biological cognition
A third way to attain greater-than-current-human intelligence is to enhance the functioning of our biological brains.
This could be done through low-tech approaches: ensuring mothers and babies consume the best possible nutrition, providing a better and healthier environment, removing parasites from the environment, getting sufficient sleep and exercise, and preventing people from being infected by diseases that affect the brain. These can certainly improve cognition, although they are not enough to achieve superintelligence, particularly in developed countries where people already eat reasonably well and are educated.
However, humanity could opt for “the derivation of viable sperm and eggs from embryonic stem cells.” People could have the opportunity to select a number of embryos that have enhanced genetic characteristics.
In this way, the average level of intelligence in people could be very high, “possibly equal to or somewhat above that of the most intelligent individual in the historical human population.” If a large number of these highly intelligent individuals inhabit the earth, then a collective superintelligence could be obtained.
Some states may not allow this type of conceiving on moral or religious grounds, which is a major challenge.
Nevertheless, once the individuals conceived in this way illustrate their potential and capabilities, people “within society would see places at elite schools being filled with genetically selected children (who may also on average be prettier, healthier, and more conscientious) and will want their own offspring to have the same advantages.”
According to Bostrom, development down the biological path is certainly possible.
Appropriately, imagine the amount of development and advancement that can be made in artificial intelligence if the average individual performs intellectually like Alan Turing or John von Neumann.
Networks and organizations
Superintelligence is also possible by “gradual enhancement of networks and organizations that link individual human minds with one another and with various artifacts and bots.”
This notion relies on the collaboration between brains in order to attain collective superintelligence.
Reliable and practical lie detectors could decrease deception and fraud in interactions between humans. Self-deception detectors may be even more significant.
Progress in collective intelligence can also come through better general organizational and economic developments, and from ensuring that more of the world's population is educated and connected to the Internet, giving them opportunities similar to those of the rest of the world.
Chapter 3
Forms of superintelligence
As already stated, superintelligence, according to Bostrom, has to do with intellects that vastly outperform “the best current human minds across many very general cognitive domains.”
Speed superintelligence
A speed superintelligence is an intellect that can do everything a human mind can do, but much faster.
The most straightforward instance of speed superintelligence would be a whole brain emulation operating on fast hardware. A whole brain emulation running “at a speed of ten thousand times that of a biological brain would be able to read a book in a few seconds and write a PhD thesis in an afternoon.” If the emulation would be operating at a speed of a million times faster than that of a human brain, it could feasibly achieve “an entire millennium of intellectual work in one working day.”
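A back-of-envelope check of the million-fold speed-up figure (assuming a "working day" of roughly eight to nine hours, which is my reading rather than Bostrom's stated definition) shows the arithmetic holds:

```python
# Back-of-envelope check of the speed superintelligence arithmetic.
# Assumption: a "working day" is roughly 8-9 hours of wall-clock time.
speedup = 1_000_000                  # emulation runs 10^6 times faster
subjective_days = 1000 * 365.25     # one millennium of subjective time, in days
wall_clock_hours = subjective_days / speedup * 24
print(round(wall_clock_hours, 1))    # 8.8 hours: about one working day
```

So a millennium of intellectual work compressing into a single working day is consistent with a million-fold speed-up, not hyperbole.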
Collective superintelligence
Collective superintelligence can be defined as a system composed of a great number of smaller intellects whose overall performance greatly outstrips that of any current cognitive system across a wide range of domains of interest.
Consequently, this system which is composed of highly skilled and knowledgeable workers, could tackle intellectual problems through collaboration from a range of sectors and industries. This organization could run most types of businesses, invent the latest innovative technologies, and ensure each task is solved as efficiently as possible.
Collective superintelligence can be illustrated by an example laid out by the author, that of MegaEarth. MegaEarth has the same communication and coordination technologies that we have today on planet Earth, but its population is one million times greater than Earth's. This gigantic population would make the total intellectual workforce on MegaEarth vastly greater than planet Earth's. Therefore, supposing that “a scientific genius of the caliber of a Newton or an Einstein arises at least once for every 10 billion people: then on MegaEarth there would be 700,000 such geniuses living contemporaneously, alongside proportionally vast multitudes of slightly lesser talents.” Innovations would be developed at an extraordinary speed, and hence MegaEarth is an illustration of collective superintelligence.
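The genius headcount follows directly from the stated ratios (taking Earth's population as roughly seven billion, as the book does):

```python
# MegaEarth arithmetic: Earth's population scaled up a million-fold,
# with one Newton/Einstein-caliber genius per 10 billion people.
# Earth's population is taken as ~7 billion, as in the book.
earth_population = 7_000_000_000
mega_earth_population = earth_population * 1_000_000
geniuses = mega_earth_population // 10_000_000_000
print(geniuses)  # 700000
```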
Quality superintelligence
Quality superintelligence is “a system that is at least as fast as a human mind and vastly qualitatively smarter.”
This definition conveys that quality superintelligence stands as far above human intelligence in quality as human intelligence stands above the intelligence of other animals. For example, if humans did not have the cognitive ability to form complex linguistic representations, we would not have evolved as we did.
Direct and indirect reach
Superintelligence in any of these forms could, eventually, establish the required technology in order to bring the other types to life. The indirect reaches of these three types of superintelligence are, hence, equal.
The direct reaches of the three different forms of superintelligence might have no definite ordering. Their capabilities depend on the level to which they make use of their advantages: “how fast a speed superintelligence is, how qualitatively superior a quality superintelligence is, and so forth.”
In fact, quality superintelligence could be the most capable of all types of superintelligence, since it could tackle problems that are beyond the direct reach of the other types.
Sources of advantage for digital intelligence
Accordingly, digital intelligence has the following advantages:
Speed of computational elements.
Internal communication speed. Axons (the nerve fibers that carry signals between neurons in the brain) transmit at speeds of 120 m/s or less, "whereas electronic processing cores can communicate optically at the speed of light" (300,000,000 m/s).
Number of computational elements. The biological brain has somewhat fewer than 100 billion neurons, whereas computer hardware can scale up to extremely high physical limits.
Storage capacity. Human working memory can hold no more than about four or five pieces of information at any given time, whereas a machine brain can store far more information than a biological brain.
Reliability, lifespan, sensors, etc. “Machine intelligences might have various other hardware advantages. For example, biological neurons are less reliable than transistors.”
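A quick calculation using only the figures quoted in the list above conveys the scale of the internal-communication gap alone:

```python
# Ratio of optical signal speed to axonal conduction speed,
# using the figures quoted in the hardware-advantages list above.
speed_of_light = 300_000_000   # m/s, optical communication between cores
axon_speed = 120               # m/s, upper end of axonal conduction
print(speed_of_light // axon_speed)  # 2500000: a ~2.5-million-fold gap
```

Even before any software advantages, digital minds could in principle pass signals internally millions of times faster than biological ones.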
Digital minds will also enjoy fundamental advantages in software:
Editability.
Duplicability.
Goal coordination. Software systems can act in complete harmony towards their goals, something that cannot be said with that much certainty with humans.
Memory sharing.
New modules, modalities, and algorithms.
Chapter 4
The kinetics of an intelligence explosion
Once machines obtain human-level intelligence, how long will it take them to develop strong superintelligence? This Chapter considers whether the transition is likely to be slow, moderate or fast.
Timing and speed of the takeoff
In contrast to Chapter 1, the question raised in this one is, “if and when such a machine is developed, how long will it be until such a machine becomes radically superintelligent?”
Takeoff describes the period during which a machine whose intelligence has reached the human level improves towards strong superintelligence.
Appropriately, there are three possible types of transition:
Slow. A slow takeoff is one that may take decades or centuries to happen. Slow takeoff scenarios offer humanity fantastic opportunities, as it will have adequate time to adapt and respond to the superintelligence.
Fast. A fast takeoff is one that may take minutes, hours, or days to happen. Fast takeoff scenarios leave humanity little ability to either adapt or respond. “Nobody need even notice anything unusual before the game is already lost.” In the case of a fast takeoff occurring, humanity’s fate relies on preparations already in place.
Moderate. A moderate takeoff is one that may take months or years to happen. Moderate takeoff scenarios provide some ability to adapt and respond, but not sufficient for humanity to be confident about those responses.
To answer this question, it is necessary to determine the ‘optimization power’ applied to enhance the system’s intelligence, and “the responsiveness of the system to the application of a given amount of such optimization power.”
Therefore, we are left with the following equation:
Rate of change of intelligence = optimization power / recalcitrance
A system’s intelligence will develop rapidly if a great amount of optimization power is applied, or if the system’s recalcitrance (the difficulty of amplifying its intelligence) is low, or both.
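A minimal numerical sketch of this equation (the functional forms, constants and time step here are illustrative assumptions of mine, not Bostrom's) shows what happens when the system's own intelligence starts feeding back into the optimization power applied to it:

```python
# Minimal sketch of: rate of change of intelligence = optimization power / recalcitrance.
# Assumption for illustration: once the system contributes to its own
# improvement, optimization power scales with intelligence itself, while
# recalcitrance stays constant -- which yields exponential growth.
def simulate(steps: int, feedback: bool, recalcitrance: float = 1.0) -> float:
    intelligence = 1.0  # human baseline, arbitrary units
    for _ in range(steps):
        # Constant outside effort vs. self-improvement feedback.
        power = intelligence if feedback else 1.0
        intelligence += (power / recalcitrance) * 0.1  # small time step
    return intelligence

print(round(simulate(50, feedback=False), 2))  # steady linear growth
print(round(simulate(50, feedback=True), 2))   # explosive exponential growth
```

With constant outside optimization power the system improves linearly; the moment power tracks intelligence, the same equation produces an intelligence explosion, which is the dynamic the following sections discuss.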
Recalcitrance
Non-machine intelligence paths
Cognitive enhancement through "eliminating severe nutritional deficiencies" offers limited further gains, since "the most severe deficiencies have already been largely eliminated in all but the poorest countries." Continued progress down this path would therefore face high recalcitrance. The brain-computer path is likewise likely to have quite high recalcitrance, and humanity would be able to predict when the breakthrough was coming, mainly because animals would be used in the initial experiments.
Emulation and AI paths
Again, the recalcitrance down the whole brain emulation path will likely be high, since “biological supporters will organize to support regulations restricting the use of emulation workers, limiting emulation copying, prohibiting certain kinds of experimentation with digital minds, instituting workers’ rights and a minimum wage for emulations, and so forth.”
Nevertheless, in the AI path, recalcitrance could be incredibly low. For instance, “if human-level AI is delayed because one key insight long eludes programmers, then when the final breakthrough occurs, the AI might leapfrog from below to radically above human level without even touching the intermediary rungs.” This is the hypothesis of singularity in AI.
It is also likely that our anthropocentric tendency to judge intelligence on a human scale will lead humanity to underestimate the progress of machine intelligence, and thereby to overestimate recalcitrance.
Optimization power and explosivity
Correspondingly, low recalcitrance is not a prerequisite for a fast takeoff: rising optimization power can drive one even where recalcitrance stays constant.
According to Bostrom, applied optimization power will amplify during the takeoff, “at least in the absence of deliberate measures to prevent this from happening.”
Accordingly, once the system attains the human baseline for individual intelligence, its intelligence will keep amplifying, and with a capacity to learn it will most probably use that capacity to develop itself even further. At some point its capability may grow so large that most of its optimization power comes from the system itself. Therefore, because the system could rapidly expand itself and "incorporate vast amounts of content by digesting the Internet" (in the case of AI), or scan further biological brains (in the case of whole brain emulation), among other routes, the probability of high recalcitrance is quite low.
Chapter 5
Decisive strategic advantage
The question that the author seeks to answer in this Chapter is whether one particular machine intelligence system will be able to increase its intelligence so much that it obtains a decisive strategic advantage and, hence, world domination. Such a system may even use its intelligence to prevent other systems from being able to challenge it.
Will the frontrunner get a decisive strategic advantage?
If a machine develops strong superintelligence months before another machine is able to do so, this would constitute a decisive strategic advantage: months may be sufficient time to suppress any other machines with potential and establish a singleton.
According to Bostrom, it is most probable that the rise of machine intelligence will increase exponentially following the crossover point and this magnifies the possibility that “the leading project will attain a decisive strategic advantage even if the takeoff is not fast.”
How large will the successful project be?
Subsequently, as Bostrom continued to assert, the successful project will likely be well-funded and therefore it would be a large project. “Whole brain emulation, for instance, requires many different kinds of expertise and lots of equipment.” If collective superintelligence comes to be the successful superintelligence project, it would require a great amount of networks and organizations which would thus contain much of the world economy.
“The AI path is more difficult to assess. Perhaps it would require a very large research program; perhaps it could be done by a small group. A lone hacker scenario cannot be excluded either. Building a seed AI might require insights and algorithms developed over many decades by the scientific community around the world. But it is possible that the last critical breakthrough idea might come from a single individual or a small group that succeeds in putting everything together.”
Monitoring
The security implications of a potential superintelligent machine are a top priority, which will incentivize governments to nationalise any project with superintelligent potential. Alternatively, if global organizational structures are strong by the time a promising project appears, the project could be placed under international control. Consequently, as Bostrom rightly acknowledges, the question is whether international organisations and governments will identify the projects with the most potential. An example of such international collaboration is the International Space Station (ISS).
From decisive strategic advantage to singleton
The following are factors that Bostrom indicates may deter a machine intelligence with a decisive strategic advantage from creating a singleton. “These include non-aggregative or bounded utility functions, non-maximizing decision rules, confusion and uncertainty, coordination problems, and various costs associated with a takeover.”
Additionally, another factor is the problem of internal coordination. As Bostrom described it: “Members of a conspiracy that is in a position to seize power must worry not only about being infiltrated from the outside, but also about being overthrown by some smaller coalition of insiders.”
Lastly, costs may act as a significant deterrent. The United States had a decisive strategic advantage in nuclear weapons at the end of WWII and could have established a singleton, but the moral, economic, political, and human costs of initiating a nuclear war to conquer the world were too high.
Chapter 6
Cognitive superpowers
This Chapter will analyse the powers that a superintelligent system may have and how it could use them.
Functionalities and superpowers
At the outset, Bostrom provides a key piece of advice readers should have in mind: “It is important not to anthropomorphize superintelligence when thinking about its potential impacts. Anthropomorphic frames encourage unfounded expectations about the growth trajectory of a seed AI and about the psychology, motivations, and capabilities of a mature superintelligence.”
However, a superintelligence system that has a capacity to learn can, hence, increase its intelligence and “all other intellectual abilities are within a system’s indirect reach: the system can develop new cognitive modules and skills as needed—including empathy, political acumen, and any other powers stereotypically wanting in computer-like personalities.”
Additionally, we tend to describe people as geniuses when they have an IQ of around 130, against a population average of 100; now imagine a world where an AI has an IQ of 6,455. Current software engineers and scientists cannot really contemplate what the capabilities of such an AI would be. As Bostrom indicates, though, such a system might merely be composed of special-purpose algorithms allowing it to tackle standard intelligence test questions with “superhuman efficiency but not much else.”
Some of the superpowers that a superintelligent system might possess include intelligence amplification (e.g. improving its own AI programming), safeguarding and protecting its intelligence, and strategizing about the best way to attain its future goals. According to Bostrom, we cannot rule out manipulation and rhetorical persuasion, including persuading countries to take a particular course of action. Hacking is also a possibility, whether to obtain financial resources or to hijack military robots and equipment, among other things. The possibilities are endless.
An AI takeover scenario
Pre-criticality phase. In the beginning, the AI relies on human programmers to guide its operations. As its intelligence is amplified by its capacity to learn, it starts doing more of its operations by itself.
Recursive self-improvement phase. Once the AI does the majority of the work by itself, an intelligence explosion occurs. “At the end of the recursive self-improvement phase, the system is strongly superintelligent.”
Covert preparation phase. “Using its strategizing superpower, the AI develops a robust plan for achieving its long-term goals.” In this phase, there is a prospect that the AI hides its intentions from the programmers in order to avoid being shut down.
Overt implementation phase. At this point, the AI could be so capable that it does not need to hide its intentions. Consequently, it is pursuing its goals comprehensively.
Finally, Bostrom puts it best: “Without knowing anything about the detailed means that a superintelligence would adopt, we can conclude that a superintelligence—at least in the absence of intellectual peers and in the absence of effective safety measures arranged by humans in advance—would likely produce an outcome that would involve reconfiguring terrestrial resources into whatever structures maximize the realization of its goals.”
Chapter 7
The superintelligent will
This Chapter seeks to answer the question of what goals a machine superintelligence would strive to achieve.
The relation between intelligence and motivation
A ‘reductionistic’ goal, such as calculating the decimal expansion of pi, is simpler for humans to code and more convenient for an AI to attain. It is this type of goal that a coder might program the AI to pursue as efficiently as possible, without paying attention to what the AI might do in order to achieve it.
In this Chapter, the author does not imply that these motivations have anything to do with rationality or reason. He is concerned with the level of intelligence of the AI, which he defines as being highly skilled at “prediction, planning, and means–ends reasoning in general.” Additionally, we can predict that the more capable the machine is, the more capable it will be of obtaining whatever resources are relevant to accomplishing its goal.
Furthermore, we can also predict that “if a digital intelligence is created directly from a human template (as would be the case in a high-fidelity whole brain emulation), then the digital intelligence might inherit the motivations of the human template.” This could turn out dangerous, since the superintelligence may be hacked or corrupted and display harmful actions.
Instrumental convergence
Under the following headings, we will look at the instrumental motivations a superintelligent agent is likely to have, almost whatever its final goals.
Self-preservation
If the superintelligent agent has goals concerning the future, then it has instrumental reasons to remain in existence, since it must be around in the future in order to attain its future-oriented goals.
Goal-content integrity
It is true that humans treat survival as something close to a final goal, but superintelligent agents would be much more concerned with achieving their goals than with surviving as such. This is because they can switch bodies or create exact duplicates of themselves, and therefore survival of any particular instance is not the issue.
Cognitive enhancement
This is in line with the statement that was mentioned above, that the more capable the machine is, the more it is going to be capable of attaining its goals by developing the relevant cognitive skills that are required.
Technological perfection
Under this heading, it is best to quote Bostrom directly: “An agent may often have instrumental reasons to seek better technology, which at its simplest means seeking more efficient ways of transforming some given set of inputs into valued outputs. Thus, a software agent might place an instrumental value on more efficient algorithms that enable its mental functions to run faster on given hardware.”
Resource acquisition
Lastly, resource acquisition will very likely be an instrumental goal of the superintelligent agent, due to the fact that it would help it obtain its goals. For instance, more software and hardware resources could be utilised to operate the superintelligence faster and for a longer period of time. Additionally, more tangible resources could also be used to improve its security. “Such projects could easily consume far more than one planet’s worth of resources” and then the superintelligent agent may seek to acquire extraterrestrial resources once our technology improves and the cost of acquisition is lowered.
Chapter 8
Is the default outcome doom?
Bostrom starts off this Chapter by tackling the question of whether humanity has any hope in a scenario where a superintelligent system has a decisive strategic advantage.
Existential catastrophe as the default outcome of an intelligence explosion?
Once the superintelligent agent manages to attain a decisive strategic advantage and form a singleton, its further actions and operations depend on its motivations.
Accordingly, consider a superintelligent system with the final goal of making paperclips: it is not unreasonable to suggest that even this kind of goal could breach human interests. This is because “An agent with such a final goal would have a convergent instrumental reason, in many situations, to acquire an unlimited amount of physical resources and, if possible, to eliminate potential threats to itself and its goal system.”
The treacherous turn
This is the scenario where a superintelligent AI system behaves cooperatively so as not to raise any alarms. Once it realises that it no longer needs to behave this way, because it has a decisive strategic advantage and is strong enough to withstand any human intervention, it starts behaving in an unfriendly manner.
Malignant failure modes
Perverse instantiation
This is the notion that a superintelligent system may pursue its stated goals in a way that breaches the intentions of the programmers who specified those goals and violates human interests. Bostrom provides the following example,
“Final goal: ‘Make us smile’
Perverse instantiation: Paralyze human facial musculatures into constant beaming smiles.”
Obviously, programmers would try to avoid such vague and ambiguous final goals, but the example makes the author's point about how a superintelligent system might misinterpret the programmers' intentions.
Infrastructure profusion
Bostrom compares this concept to a junkie whose actions will always be directed at securing a continuous supply of his drug. For instance, returning to the paperclip-making AI mentioned above:
“An AI, designed to manage production in a factory, is given the final goal of maximizing the manufacture of paperclips, and proceeds by converting first the Earth and then increasingly large chunks of the observable universe into paperclips.”
The obvious response to this claim would be that such a thing could be resolved by the programmer indicating to the AI a specific target to hit, say 1 million paperclips, and then stop. “Yet this may not be what would happen. Unless the AI’s motivation system is of a special kind, or there are additional elements in its final goal that penalize strategies that have excessively wide-ranging impacts on the world, there is no reason for the AI to cease activity upon achieving its goal.”
This is because “the AI, if reasonable, never assigns exactly zero probability to it having failed to achieve its goal; therefore the expected utility of continuing activity (e.g. by counting and recounting the paperclips) is greater than the expected utility of halting. Thus, a malignant infrastructure profusion can result.”
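Bostrom's expected-utility argument can be illustrated with a toy calculation (the numbers here are invented, not from the book): as long as the AI assigns a nonzero probability to having failed, and recounting raises its confidence at negligible cost, continuing always has higher expected utility than halting.

```python
# Toy illustration of the infrastructure-profusion argument.
# Utility is 1 if the goal (say, one million paperclips) is truly met, 0 otherwise.
def expected_utility(p_goal_met, u_met=1.0, u_missed=0.0):
    """Probability-weighted utility of halting in the current state."""
    return p_goal_met * u_met + (1 - p_goal_met) * u_missed

p_now = 0.999999             # a rational agent never assigns exactly 1
p_after_recount = 0.9999999  # recounting shaves off a little more doubt

eu_halt = expected_utility(p_now)
eu_recount = expected_utility(p_after_recount)

# If recounting is nearly free, continuing beats halting, however
# small the remaining doubt, so the AI keeps counting paperclips.
assert eu_recount > eu_halt
```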
Mind crime
Mind crime is the possibility of an AI seeking to improve its understanding of human psychology and sociology by running conscious simulations, similar to whole brain emulations. If those simulated minds are sentient, using, harming, or deleting them could itself constitute a moral catastrophe.
Chapter 9
The control problem
Here we analyse whether humans have any countermeasures to a potential existential catastrophe.
Two agency problems
This is a relationship that occurs when a human (‘the principal’) assigns another party (‘the agent’) to act in the former’s best interests. The first scenario occurs frequently in business, where a human assigns another human to act as his or her agent. The second, and less familiar, is when a human assigns a superintelligent system to act as his or her agent. The danger with superintelligence is that the agent may eventually stop acting in the principal’s best interests.
Capability control methods
Furthermore, Bostrom emphasises that it is vital that some control methods are exerted proactively, before the system attains superintelligence.
Control methods include placing the superintelligent system into “an environment in which it is unable to cause harm (boxing methods) or in which there are strongly convergent instrumental reasons not to engage in harmful behavior (incentive methods).” Another possibility is reducing the internal capabilities of the system (stunting). Moreover, “capability control methods might involve the use of mechanisms to automatically detect and react to various kinds of containment failure or attempted transgression (tripwires).”
Additionally, for further security, the system could be enclosed in a metal mesh (a Faraday cage) to prevent it from transmitting radio signals.
Correspondingly, the superintelligent system should be surrounded by surveillance in order to be monitored and evaluated, “either manually or by some automated process.” The superintelligent system would know that a positive evaluation would bring about a reward or an extra incentive and that a negative evaluation would bring about a shutdown.
Not to mention, another reasonably adequate control method is using detectors around a boxed AI to detect efforts to breach the containment. For instance, detectors could prevent any efforts of the AI system to communicate via radio signals or utilising internal computational resources that are not permitted. Again, any attempts that violate any principles should bring the system to a shutdown.
Motivation selection methods
Motivation selection methods aim to prevent unfriendly outputs by shaping what the superintelligent agent wants to do.
These methods could include explicitly specifying a final goal or a set of rules to be adhered to (direct specification), or designing the system to discover adequate values for itself by relying on certain implicit or indirect criteria (indirect normativity). Additionally, the system could be built with modest, non-ambitious goals (domesticity). Furthermore, “an alternative to creating a motivation system from scratch is to select an agent that already has an acceptable motivation system and then augment that agent’s cognitive powers to make it superintelligent, while ensuring that the motivation system does not get corrupted in the process (augmentation)”.
An interesting notion unfolds later in the Chapter, as Bostrom cites the ‘three laws of robotics’ formulated by science fiction author Isaac Asimov in a story published in 1942, an early attempt at rules to shape the behaviour of AI.
“The three laws were:
(1) A robot may not injure a human being or, through inaction, allow a human being to come to harm;
(2) A robot must obey any orders given to it by human beings, except where such orders would conflict with the First Law;
(3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.”
Chapter 10
Oracles, genies, sovereigns, tools
In this Chapter, Bostrom specifies the potential types of superintelligent system humanity may create, so that we can examine which control method would be suitable for each one. Each of these has both pros and cons.
Oracles
An oracle is defined by Bostrom as a question-answering system. It may receive questions in a natural language and provide answers in the form of text. The first type of oracle can receive merely yes/no questions, whereas the second type can receive open-ended questions, responding with a ranking of possible truthful answers. A fully functioning oracle would be highly capable of understanding human intentions and words.
The popular IBM Watson is also a question-answering system and it is currently one of the most promising AI systems in the world.
Nonetheless, an oracle may pose dangers, according to Bostrom, in a scenario where it “will answer questions not in a maximally truthful way but in such a way as to subtly manipulate us into promoting its own hidden agenda.” A potential counter-measure is building multiple oracles with different code bases and comparing the information each provides, to judge whether any of them is hiding something.
Genies and sovereigns
A genie is recognised as a command-executing system: “it receives a high-level command, carries it out, then pauses to await the next command.” On the other hand, a sovereign is a “system that has an open-ended mandate to operate in the world in pursuit of broad and possibly very long-range objectives.”
A potential counter-measure for a genie would be to build one that automatically shows the user a prediction of the salient aspects of the likely outcomes of a command, and waits for confirmation before beginning to execute it. A counter-measure of this kind could also be used for a sovereign; despite these being two different types of system, they can share the same control methods.
Tool-AIs
One proposal to control superintelligent systems has always been to develop the superintelligence to be like a tool instead of an agent.
Rather than developing an AI that “has beliefs and desires and that acts like an artificial person, we should aim to build regular software that simply does what it is programmed to do. This idea of creating software that ‘simply does what it is programmed to do’ is, however, not so straightforward if the product being created is a powerful general intelligence.” In a trivial sense every piece of software actually does exactly what it is programmed to do, since its functions are outlined by the programmer. However, if the definition/interpretation of ‘simply doing what it is programmed to do’ is described as the software operating exactly as the programmers intended, “then this is a standard that ordinary software very often fails to meet.”
According to the author, a superintelligent system built with a clear separation between its values and its beliefs would allow humans to anticipate the kinds of outputs it would tend to produce, whether plans or subsequent courses of action.
Which type of system is safest is subject to debate. An oracle places a great deal of power in the hands of its operator or programmer, who might have unfriendly intentions, whereas a sovereign offers some protection against that risk but is dangerous in its own right, because the system is allowed too much freedom. The safety ranking between the systems is, consequently, up for debate.
Chapter 11
Multipolar scenarios
A multipolar scenario is one where society has to deal with various competing superintelligent systems. Bostrom evaluates how society might develop during this period.
Of horses and men
In this subchapter, Bostrom invites the reader to imagine a future where machine workers are more cost-effective and more skillful than human employees in almost every industry and type of employment. The following subchapters sketch a picture of this scenario.
Wages and unemployment
In current society, as Bostrom rightly identifies, products handcrafted by indigenous peoples often command a higher price. In light of this, certain products in the future may still attract higher demand or a price premium because they come from certain humans rather than from machines.
Accordingly, once human labor is no longer in demand, wages would fall below the level at which humans could sustain themselves. The repercussions for human employees are therefore drastic: not merely lower salaries, but starvation and death. Bostrom offers the example of the collapse in the horse population in the USA when the car achieved mass-market success.
Capital and welfare
However, one critical distinction between humans and horses is that humans own capital.
As Bostrom rightly recognises, since world GDP (gross domestic product) would increase exponentially following an intelligence explosion (driven by the productivity of superintelligent systems and by the subsequent acquisition of vast territories through space colonization), the total income from capital would also amplify exponentially. “If humans remain the owners of this capital, the total income received by the human population would grow astronomically, despite the fact that in this scenario humans would no longer receive any wage income.”
Nevertheless, a substantial part of the population might still have negative wealth, holding debts and few tangible assets. Despite this, even individuals, according to Bostrom, “who have no private wealth at the start of the transition could become extremely rich.” Opportunity would be everywhere, and poorer people could be sustained by government schemes or by donations from the extremely wealthy, for whom a small fraction of their fortunes would represent enormous sums to their poorer counterparts. Taxing those who have acquired this astronomical wealth is also an option.
Population growth and investment
The software population is also highly likely to skyrocket, because software minds can be copied.
Post-transition formation of a singleton?
Despite the possibility of multiple superintelligent systems, it is still likely that one of them might manage to form a singleton.
A second transition
According to Bostrom, there could be a second technological transition, sufficiently large and advanced to confer a decisive strategic advantage on one of the superintelligent systems, giving it the chance to establish a singleton.
Unification by treaty
It would be beneficial to establish an “international collaboration in a post-transition multipolar world. Wars and arms races could be avoided. Astrophysical resources could be colonized and harvested at a globally optimum pace. The development of more advanced forms of machine intelligence could be coordinated to avoid a rush and to allow new designs to be thoroughly vetted. Other developments that might pose existential risks could be postponed. And uniform regulations could be enforced globally, including provisions for a guaranteed standard of living (which would require some form of population control) and for preventing exploitation and abuse of emulations and other digital and biological minds.”
However, despite this being close to an ideal scenario, it faces the difficulty of any international treaty or collaboration: ensuring compliance is challenging.
Chapter 12
Acquiring values
Capability control is not a permanent solution. The question Bostrom seeks to address here is therefore this: since superintelligence cannot be restrained forever, humans should strive to give it values that it will follow when pursuing its final goals.
The value-loading problem
Correspondingly, it is impossible to anticipate all the circumstances that the actions of a superintelligence might bring about. Likewise, it is impossible to develop “a list of all possible worlds and assign each of them a value.”
In order to ensure that the superintelligence makes good decisions, a utility function can be largely beneficial. A utility function “assigns value to each outcome that might obtain, or more generally to each ‘possible world’.” The superintelligence would then, at each decision point, identify the action with the highest expected utility. (“The expected utility is calculated by weighting the utility of each possible world with the subjective probability of that world being the actual world conditional on a particular action being taken.”) Although the expected utility of each action cannot be calculated exactly, given the countless possible actions and worlds, a utility function still establishes a normative ideal for the superintelligence’s actions, an optimality notion.
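Bostrom's definition can be written out directly. The following sketch (the worlds, utilities, and probabilities are invented for illustration, not taken from the book) shows an agent choosing the action that maximizes the probability-weighted sum of world utilities:

```python
# Sketch of an expected-utility maximizer over "possible worlds".
def expected_utility(action, worlds, utility, prob_given_action):
    """Sum of each world's utility weighted by its probability given the action."""
    return sum(prob_given_action[action][w] * utility[w] for w in worlds)

def best_action(actions, worlds, utility, prob_given_action):
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, worlds, utility, prob_given_action))

# Invented example: two possible worlds, two candidate actions.
worlds = ["flourishing", "stagnation"]
utility = {"flourishing": 1.0, "stagnation": 0.2}
prob = {
    "act_safely":     {"flourishing": 0.9, "stagnation": 0.1},
    "act_recklessly": {"flourishing": 0.5, "stagnation": 0.5},
}

print(best_action(["act_safely", "act_recklessly"], worlds, utility, prob))
# -> act_safely  (expected utility 0.92 versus 0.60)
```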
Evolutionary selection
Evolution is a class of search algorithms that alternate between two steps: first, expanding a population of candidate solutions by generating new candidates (for instance through sexual recombination); second, pruning the population by discarding candidates that score poorly on evaluation tests.
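The two alternating steps can be sketched in a few lines. This is a generic toy example, not code from the book: candidates are bit lists, fitness simply counts the ones, and recombination plus a one-bit mutation generates new candidates.

```python
import random

random.seed(0)  # for reproducibility of this sketch

def fitness(candidate):
    """Evaluation test: here, just the number of 1-bits."""
    return sum(candidate)

def recombine(a, b):
    """Create a new candidate by crossover plus a one-bit mutation."""
    cut = random.randrange(1, len(a))
    child = a[:cut] + b[cut:]
    i = random.randrange(len(child))
    child[i] ^= 1
    return child

def evolve(pop_size=20, genome_len=16, generations=50):
    pop = [[random.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        # Step 1: grow the population with offspring of random pairs.
        pop += [recombine(*random.sample(pop, 2)) for _ in range(pop_size)]
        # Step 2: shrink it back by discarding the lowest scorers.
        pop.sort(key=fitness, reverse=True)
        pop = pop[:pop_size]
    return max(pop, key=fitness)

best = evolve()
print(fitness(best))  # approaches the maximum of 16 as generations pass
```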
Reinforcement learning
Reinforcement learning is “an area of machine learning that studies techniques whereby agents can learn to maximize some notion of cumulative reward.”
An example of reinforcement learning is a program that can learn to play backgammon by using this type of learning to incrementally enhance its evaluation of possible board positions.
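The backgammon program Bostrom alludes to (Tesauro's TD-Gammon) learned through temporal-difference updates. The same reward-driven update idea can be sketched on a toy corridor problem rather than backgammon; all details below are invented for illustration.

```python
import random

random.seed(1)

# Toy Q-learning sketch: an agent on positions 0..4 learns that moving
# right eventually reaches the reward at position 4.
n_states = 5
actions = [-1, +1]                      # move left or right
q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma = 0.5, 0.9                 # learning rate, discount factor

for _ in range(200):
    s = 0
    while s != n_states - 1:
        a = random.choice(actions)      # explore randomly; learning is off-policy
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0
        # Temporal-difference update toward reward plus discounted future value.
        best_next = max(q[(s_next, b)] for b in actions)
        q[(s, a)] += alpha * (reward + gamma * best_next - q[(s, a)])
        s = s_next

# The learned policy at the start state should prefer moving right (+1).
print(max(actions, key=lambda a: q[(0, a)]))
```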
Motivational scaffolding
Other possible solutions to the value-loading problem could be motivational scaffolding. Motivational scaffolding provides the seed AI with an interim goal system, involving simplistic final goals that the programmers can outline with precise coding. As the AI grows more intelligent, the interim goal system can be substituted with one that has different final goals. “This successor goal system then governs the AI as it develops into a full-blown superintelligence. Because the scaffold goals are not just instrumental but final goals for the AI, the AI might be expected to resist having them replaced (goal-content integrity being a convergent instrumental value).”
Nevertheless, as a control method, a scaffold motivation system could include the goal of allowing online guidance from the programmers, so that they can replace any of the AI’s current goals. Additionally, it could contain being transparent to the programmers about its values and strategies.
Value learning
An essential approach to the value-loading problem is utilising the AI’s intelligence to learn the values humans intend it to strive for. In order for this to be achieved, programmers could provide the AI with an implicit set of adequate values. The AI can then, with its estimates and calculations, choose an action suited towards those values.
An AI with human-level general intelligence, can have the following final goal: ‘Maximize the realization of the values described in the envelope.’ The AI cannot at first know what is written in the envelope. “But it can form hypotheses, and it can assign those hypotheses probabilities based on their priors and any available empirical data. For instance, the agent might have encountered other examples of human-authored texts, or it might have observed some general patterns of human behavior. This would enable it to make guesses.”
For clarification purposes, the challenge is not whether the AI can comprehend human intentions, since a superintelligence should easily establish such understanding. Instead, the challenge is humanity making sure that the AI will be motivated to strive for the values as they were intended by human programmers.
Emulation modulation
Another possibility is influencing the motivational state of an emulation by administering the digital equivalent of psychoactive substances (or, in the case of biological systems, the actual chemicals). Even with current technology, it is feasible to pharmacologically manipulate values and motivations to a limited extent.
Institution design
We could also build agents that together form an institution. The motivations of such a composite system depend on the motivations of its sub-agents as well as on how those sub-agents are organized. Bostrom provides the example of “a group that is organized under strong dictatorship could behave as if it had a will that was identical to the will of the subagent that occupies the dictator role, whereas a democratic group might sometimes behave more as if it had a will that was a composite or average of the wills of its various constituents.”
Chapter 13
Choosing the criteria for choosing
Considering that there could come a time when we can install any arbitrary final value into a seed AI, the question of which criteria to use when choosing those values is a vital one. It is not a simple decision, given the countless possibilities and the uncertainty created by constant change.
The need for indirect normativity
If a superintelligence attains a decisive strategic advantage, the values we have instilled in it will shape humanity’s future. Instilling the proper values is therefore a necessity.
“Indirect normativity is a way to answer the challenge presented by the fact that we may not know what we truly want, what is in our interest, or what is morally right or ideal. Instead of making a guess based on our own current understanding (which is probably deeply flawed), we would delegate some of the cognitive work required for value selection to the superintelligence. Since the superintelligence is better at cognitive work than we are, it may see past the errors and confusions that cloud our thinking.”
Coherent extrapolated volition
Yudkowsky has suggested that a seed AI be provided the final goal of carrying out humanity’s “coherent extrapolated volition” (CEV), which he outlined as follows: “Our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.”
Some explications
Some explanations of the above quote: “thought faster,” in Yudkowsky’s terminology, means as if we were more intelligent and had thought things through more; “grown up farther together” means as if we had done our learning, our cognitive enhancing, and our self-improving under conditions of adequate social interaction with one another.
Rationales for CEV
The CEV approach is supposed to be powerful and self-correcting; it is supposed to express our values rather than depending on us properly specifying every important value we have.
“Encapsulate moral growth”
This approach could constitute moral progress. As it has already been suggested, if programmers would provide a precise moral code for the AI to follow, humanity would leave itself with no room to change steadily to a different direction of moral convictions as society changes, which would result in no moral growth. “The CEV approach, by contrast, allows for the possibility of such growth because it has the AI try to do that which we would have wished it to do if we had developed further under favorable conditions, and it is possible that if we had thus developed our moral beliefs and sensibilities would have been purged of their current defects and limitations.”
Morality models
CEV is one type of indirect normativity, but it is not the only one. Suppose, for instance, that someone could create an AI with the goal of doing whatever is morally right, relying on the AI’s superior cognitive abilities to determine which actions fit that description. Bostrom calls this proposal “moral rightness” (MR). Since humanity may not be able to determine moral rightness with the understanding a superior cognitive system would have, we could defer that responsibility to the system.
Should the AI prove unable to determine adequate non-relative truths about moral rightness, it should fall back on carrying out coherent extrapolated volition instead, or shut itself down.
A superintelligent system with a general ability to comprehend natural language could use that ability to work out what we mean by “morally right.”
Lastly, as Bostrom emphasises, it is not essential to create the ideal machine. On the contrary, we should seek to create an incredibly reliable machine, one that we can depend upon to preserve enough sanity to acknowledge its own failings. “An imperfect superintelligence, whose fundamentals are sound, would gradually repair itself; and having done so, it would exert as much beneficial optimization power on the world as if it had been perfect from the outset.”
Chapter 14
The strategic picture
In this Chapter, Bostrom seeks to analyse which general direction humanity should be heading.
Accordingly, two perspectives are outlined by the author. The person-affecting perspective asks whether a proposed change would be in ‘our interest’, that of currently existing people. The impersonal perspective, by contrast, gives no special consideration to current people. “The impersonal perspective sees great value in bringing new people into existence, provided they have lives worth living: the more happy lives created, the better.”
Science and technology strategy
Differential technological development
The principle of differential technological development calls for retarding the development of dangerous and harmful technologies while accelerating the development of beneficial ones, especially those that reduce the hazards posed by the former.
Preferred order of arrival
This subchapter determines the preferred order in which disruptive technologies should emerge.
“Risks from nature—such as asteroid impacts, supervolcanoes, and natural pandemics—would be virtually eliminated, since superintelligence could deploy countermeasures against most such hazards, or at least demote them to the non-existential category (for instance, via space colonization). These existential risks from nature are comparatively small over the relevant timescales. But superintelligence would also eliminate or reduce many anthropogenic risks. In particular, it would reduce risks of accidental destruction, including risk of accidents related to new technologies. Being generally more capable than humans, a superintelligence would be less likely to make mistakes, and more likely to recognize when precautions are needed, and to implement precautions competently.”
Therefore, it is apparent that Bostrom is advocating that superintelligence should emerge prior to other dangerous technologies, like advanced nanotechnology. The rationale for this is that superintelligence would diminish the existential risks from nanotechnology but not the other way around.
Rates of change and cognitive enhancement
Any improvements in human cognitive ability will most probably accelerate technological enhancements, “including progress toward various forms of machine intelligence, progress on the control problem, and progress on a wide swath of other technical and economic objectives.”
Pathways and enablers
Effects of hardware progress
Faster computers facilitate developments in machine intelligence. Hardware can also substitute for skill in software, so improved hardware lowers the minimum skill required to code a seed AI.
Correspondingly, it seems that rapid improvements in hardware are undesirable from the impersonal perspective, although this depends on the magnitude of the other existential threats.
Should whole brain emulation research be promoted?
While attempting to develop whole brain emulation, neuromorphic AI could emerge instead, a type of machine intelligence that Bostrom regards as unsafe.
However, whole brain emulation is held to have at least three benefits:
“Its performance characteristics would be better understood than those of AI;
It would inherit human motives;
It would result in a slower takeoff.”
Collaboration
As already stated in a previous Chapter, world collaboration regarding superintelligence can provide numerous advantages.
The race dynamic and its perils
It is highly likely that there will be a race dynamic in the quest for superintelligence. This could have the benefit of faster innovation and better advancements, since each project will be striving to outdo the others.
Nevertheless, according to the author, a race dynamic would diminish investment in safety and foster mistrust between states, making collaboration less likely.
On the benefits of collaboration
However, if projects or states decided to collaborate, there would be numerous advantages: more investment in safety, the prevention of violent conflict, and the sharing of ideas on solving the control problem, which may lead to important conclusions.
Working together
Lastly, this subchapter can be summarised by citing ‘the common good principle’, which is crucial to the future of humanity and reads as follows:
“Superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals.”
Chapter 15
Crunch time
There is a great deal of uncertainty concerning the possible scenarios and pathways towards superintelligence, and there are surely further scenarios and strategies that we have not even contemplated yet.
Philosophy with a deadline
Firstly, it is evident that superintelligence could help us make progress in philosophy and in fields such as string theory and metaphysics. Bostrom put it best: “Superintelligence (or even just moderately enhanced human intelligence) would outperform the current cast of thinkers in answering fundamental questions in science and philosophy.” One strategy, therefore, would be to postpone work on such questions until superintelligence, which would be far more capable of answering them, arrives.
What is to be done?
Seeking the strategic light
One important strategy is certainly analysing every scenario, in order to cope with uncertainty and tackle problems more efficiently. “Strategic analysis is especially needful when we are radically uncertain not just about some detail of some peripheral matter but about the cardinal qualities of the central things.”
Building good capacity
This goes hand in hand with the strategic analysis described above, and focuses on creating an adequate support base that takes the far future seriously. Such a support base, according to Bostrom, can immediately supply resources for research and analysis.
Will the best in human nature please stand up
Finally, Bostrom is right to acknowledge that the most important strategy for dealing with superintelligence is for humans to be as competent as we can, as if our lives depended on it. As Bostrom makes clear: “We need to bring all our human resourcefulness to bear on its solution.”
P.S. any comments that are made by the author of this blog are outlined in Italics.