The control problem
Here we analyse whether humans have any countermeasures against a potential existential catastrophe.
Two agency problems
An agency problem arises when a human (‘the principal’) assigns another party (‘the agent’) to act in the former’s best interests. The first scenario occurs frequently in business, where one human appoints another to act as his or her agent. The second, and less frequent, scenario is when a human assigns a superintelligent system to act as his or her agent. The danger with superintelligence is that the superintelligent agent may eventually stop acting in the principal’s best interests.
Capability control methods
Bostrom emphasised that it is vital that some control methods be applied proactively, before the system attains superintelligence.
Control methods include placing the superintelligent system into “an environment in which it is unable to cause harm (boxing methods) or in which there are strongly convergent instrumental reasons not to engage in harmful behavior (incentive methods).” Another possibility is reducing the internal capabilities of the system (stunting). Moreover, “capability control methods might involve the use of mechanisms to automatically detect and react to various kinds of containment failure or attempted transgression (tripwires).”
Additionally, for further security, the system could be enclosed in a metal mesh (in effect, a Faraday cage) to prevent it from transmitting radio signals.
Correspondingly, the superintelligent system should be kept under surveillance so that it can be monitored and evaluated, “either manually or by some automated process.” The system would know that a positive evaluation brings a reward or an extra incentive, while a negative evaluation brings a shutdown.
Another reasonably effective control method is to place detectors around a boxed AI to spot attempts to breach the containment. For instance, detectors could catch any attempt by the AI system to communicate via radio signals or to use internal computational resources that are not permitted. Again, any attempt that violates these principles should trigger a shutdown of the system.
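To make the tripwire idea more concrete, here is a minimal, purely illustrative sketch in Python. Every name in it (the radio_emission_detected and unauthorized_resource_use checks, the shutdown routine) is a hypothetical placeholder rather than anything from Bostrom’s book; it only shows the basic pattern of automatically polling containment sensors and shutting the boxed system down the moment a rule is violated.

```python
# Illustrative tripwire sketch; all sensor and response functions are hypothetical placeholders.
import time


def radio_emission_detected() -> bool:
    """Hypothetical sensor: True if the boxed system is emitting radio signals."""
    return False  # placeholder, never triggers in this sketch


def unauthorized_resource_use() -> bool:
    """Hypothetical check: True if the system uses computational resources it is not permitted to use."""
    return False  # placeholder, never triggers in this sketch


def shutdown() -> None:
    """Hypothetical containment response: halt the boxed system."""
    print("Tripwire triggered: shutting the system down.")


TRIPWIRES = [radio_emission_detected, unauthorized_resource_use]


def monitor_loop(poll_interval_seconds: float = 1.0) -> None:
    """Poll each tripwire; any violation triggers an immediate shutdown."""
    while True:
        if any(tripwire() for tripwire in TRIPWIRES):
            shutdown()
            break
        time.sleep(poll_interval_seconds)


if __name__ == "__main__":
    monitor_loop()
```

The design point of such a mechanism is that the reaction is automatic: the shutdown does not depend on a human noticing the transgression in time.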
Motivation selection methods
Motivation selection methods aim to prevent unfriendly behaviour by the superintelligent agent by shaping what the agent wants to do.
These methods include explicitly specifying a final goal or a set of rules to be adhered to (direct specification), or designing the system so that it can discover appropriate values for itself by relying on certain implicit or indirect criteria (indirect normativity). Additionally, the system could be built so that it has modest, non-ambitious goals (domesticity). Furthermore, “an alternative to creating a motivation system from scratch is to select an agent that already has an acceptable motivation system and then augment that agent’s cognitive powers to make it superintelligent, while ensuring that the motivation system does not get corrupted in the process (augmentation).”
An interesting notion unfolds later in the chapter, as Bostrom cites the ‘three laws of robotics’ formulated by science fiction author Isaac Asimov in a story published in 1942, an early attempt at directly specifying rules to govern the behaviour of intelligent machines.
“The three laws were:
(1) A robot may not injure a human being or, through inaction, allow a human being to come to harm;
(2) A robot must obey any orders given to it by human beings, except where such orders would conflict with the First Law;
(3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.”