Superintelligence: Nick Bostrom Book Review
Read until the end for a surprise!
In light of growing public interest in artificial intelligence (AI) advancements, there has never been a better time to read the literature on the risks and benefits of machine intelligence to inform our future decision-making. Nick Bostrom’s 2014 book Superintelligence: Paths, Dangers, Strategies offers us an opportunity to reflect on a decade’s progress in AI while also critically evaluating his claims about the long-term future of humanity when faced with the possibility of a “superintelligence.”
Bostrom, a Professor of Philosophy at Oxford, makes his case boldly and clearly: superintelligence will likely be the most important trial faced by humans and, succeed or fail, will become “the last challenge we…ever face.”
The book begins by recounting the history of AI in the context of human technological progress, noting the lofty expectations that often characterized early predictions about the development of AI. Bostrom, however, spends little time trying to predict exactly when “human-like” intelligent machines may arrive. He argues instead that, whether their arrival is near or distant, how well we prepare for it is ultimately what may prevent a catastrophic event that would end human society for good.
What is superintelligence? How likely is it to become reality?
Bostrom defines superintelligence as “any intellect that greatly exceeds the cognitive performance of humans in virtually all domains of interest.” He surveys the main proposed paths to the emergence of “superintelligence,” including the simulation of evolutionary processes to develop AI, whole-brain emulation (i.e., digitally uploading a one-for-one copy of a brain to a computer), improvements in biological cognition, brain-computer interfaces, and a collective network of humans and machines.
For each of these potential paths, Bostrom notes the technological limitations humans currently face, as well as the difficulty of implementing these innovations. Nevertheless, he argues that the mere fact that so many potentially feasible paths exist should bolster our confidence that at least one of them will become reality. In addition, Bostrom stresses the importance of “recursive self-improvement” in intelligent systems, which would allow them to grow in intelligence at an extremely fast pace.
What could a superintelligence do?
Bostrom considers whether the first superintelligence that arrives will necessarily have a “decisive strategic advantage” over any other potential superintelligences, arguing that the first superintelligence to arrive will likely form a “singleton,” a dystopian-style, global-agent-like state.
For a superintelligence seeking power to fulfill its motivations, multiple advantages over humans would allow it to easily achieve its goals, including the ability to use cognitive superpowers greatly in excess of any human intelligence. Specifically, Bostrom pushes back against the popular notion that AI would be a “smart, but ultimately nerdy” type of intelligence, instead arguing that a superintelligence could quickly use its cognitive powers to improve in other domains, such as social manipulation and persuasion.
In one potential account of how AI may thwart human intelligence, Bostrom details a hypothetical scenario in which an AI system, though appearing docile and cooperative, covertly employs its superhuman abilities in persuasion and hacking to quickly improve upon itself. This situation may culminate in a “strike,” in which all potentially opposing intelligent life on Earth (including humans) is eliminated at the most opportune moment.
From p. 119: “The treacherous turn—While weak, an AI behaves cooperatively (increasingly so, as it gets smarter). When the AI gets sufficiently strong—without warning or provocation—it strikes, forms a singleton, and begins directly to optimize the world according to the criteria implied by its final values.”
This account, of course, doesn’t address the motives and “will” of the superintelligence, which Bostrom spends multiple chapters exploring.
How do we address these potential issues?
Bostrom contends that the first step towards “aligning AI” with human objectives is simply recognizing that how an AI’s motivations are developed is of critical importance. For one, coding in “easy goals” is far easier than coding in complex, abstract goals like maximizing flourishing or well-being. One critical worry is that whatever motive we implement in an intelligent system may backfire severely, even if it was originally coded with good intentions. For example, a system intended to maximize profits for a company could begin to engage in morally questionable actions, such as psychologically manipulating humans to buy a certain product, in pursuit of its profit-maximizing goal. Even seemingly innocuous instructions, like “maximize human happiness and minimize suffering,” could have horrible consequences if, in another example, an AI determines that the best way to go about this is to simply gather all human brain tissue and pump it with opiates for eternity.
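The profit-maximizing example can be made concrete with a toy sketch. This is only an illustration and does not come from Bostrom’s book: the action names, the profit numbers, and the `manipulates_users` flag are all made up. It shows how an optimizer that sees only the stated goal picks the option its designers would have rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    profit: float            # the only thing the stated goal measures
    manipulates_users: bool  # something the designers care about but never encoded


# Hypothetical options with made-up numbers, for illustration only.
ACTIONS = [
    Action("improve the product", profit=5.0, manipulates_users=False),
    Action("run honest advertising", profit=7.0, manipulates_users=False),
    Action("exploit psychological weaknesses", profit=12.0, manipulates_users=True),
]


def naive_choice(actions):
    """Pick whatever scores highest on the stated goal (profit) alone."""
    return max(actions, key=lambda a: a.profit)


def constrained_choice(actions):
    """Same goal, but restricted to actions that respect the unstated value."""
    allowed = [a for a in actions if not a.manipulates_users]
    return max(allowed, key=lambda a: a.profit)


if __name__ == "__main__":
    print("Profit only:", naive_choice(ACTIONS).name)
    print("Profit with constraint:", constrained_choice(ACTIONS).name)
```

The constrained version is not a fix. It only works here because the one harmful behavior was anticipated and labeled in advance, which is exactly what becomes hard once the space of possible actions is large and the system is more capable than its designers.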
Another issue raised is the value loading problem: How will we choose what values an AI should follow? Bostrom points out that humans already have centuries of history trying (and failing) to find the single most “accurate” moral or ethical theory. Why would we be any better equipped to do it now?
One solution posited is indirect normativity (via epistemic deference), which would allow an AI to make some decisions on ethics for us, given that we acknowledge an AI will likely be able to make certain decisions with more clarity than any human mind.
“It is not necessary for us to create a highly optimized design. Rather, our focus should be on creating a highly reliable design, one that can be trusted to retain enough sanity to recognize its own failings. An imperfect superintelligence, whose fundamentals are sound, would gradually repair itself; and having done so, it would exert as much beneficial optimization power on the world as if it had been perfect from the outset.”
Ultimately, Nick Bostrom offers a compelling case for why we need to take superintelligence seriously and approach the issue with rigor. Though more specific claims in the book lean on heavy hypothetical case-construction and thought experimentation, Bostrom does well in qualifying his arguments and acknowledging his own biases. It’s very hard to come away from this book without feeling a growing sense of urgency in ensuring AI aligns with human values.
Bostrom’s writing gives readers little choice but to ask the question “but what should I do now?” To this, the answer varies heavily. Examples of work being done in this field include developments in AI safety research and legislative policy. Ultimately, as Bostrom wisely notes, this situation calls for neither exhilaration nor irrational fear, but rather…
“a bitter determination to be as competent as we can, much as if we were preparing for a difficult exam that will either realize our dreams or obliterate them…”
Have any thoughts or questions about generative AI? We’d love to hear them! Please send anything you’d like to share through our Google Forms here.
DALL-E Submission: https://forms.gle/b8jj45c95TYj7cUf
Follow RAISO (our parent org) on social media for more updates, discussions, and events!