Human Compatible: Artificial Intelligence and the Problem of Control is a non-fiction book by the British computer scientist Stuart Russell, published in 2019. Alongside the AI control problem, it argues that the risk to humanity from advanced artificial intelligence (AI) is a serious concern despite the uncertainty surrounding future progress in AI.
To me, Human Compatible reads as a philosophical introduction to AI, an attempt to understand the reality of AI from a philosophical perspective. Russell draws on great philosophical and scientific minds to make sense of AI compatibility from a human perspective.
It is one of the 10 books that I have been reviewing: The Age of AI: And Our Human Future; Superintelligence: Paths, Dangers, Strategies; Human Compatible; 2084: Artificial Intelligence and the Future of Humanity; Our Final Invention; The Singularity Is Nearer; Four Battlegrounds; The Alignment Problem; Artificial Intelligence: A Guide for Thinking Humans; and Life 3.0.
Timeline
Russell declines to predict the arrival of superintelligent AI, firstly because such predictions have gone wrong in the past. For example, he quotes AI pioneer and Nobel Prize-winning economist Herbert Simon, who said in 1960 that “Technologically . . . machines will be capable, within twenty years, of doing any work a man can do”, while in 1967 Marvin Minsky, a co-organizer of the 1956 Dartmouth workshop that started the field of AI, wrote, “Within a generation, I am convinced, few compartments of intellect will remain outside the machine’s realm—the problem of creating ‘artificial intelligence’ will be substantially solved”. These predictions remain unfulfilled to date.
A second reason why Russell won’t predict the arrival of superintelligent AI is that there is no clear threshold that will be crossed. Machines already exceed human capabilities in some areas, he says. According to Russell, “Those areas will broaden and deepen, and it is likely that there will be superhuman general knowledge systems, superhuman biomedical research systems, superhuman dexterous and agile robots, superhuman corporate planning systems, and so on well before we have a completely general superintelligent AI system.”
A third reason why he is unwilling to give any timeline is that the arrival of superintelligent AI is inherently unpredictable.
It requires “conceptual breakthroughs,” as noted by John McCarthy in a 1977 interview. McCarthy went on to say, “What you want is 1.7 Einsteins and 0.3 of the Manhattan Project, and you want the Einsteins first. I believe it’ll take five to 500 years.” Just how unpredictable are they? Probably as unpredictable as Szilard’s invention of the nuclear chain reaction a few hours after Rutherford’s declaration that it was completely impossible.
Here are some important personal takeaways, centered around the core challenges and solutions Russell presents:
The Problem of Control
Russell argues that AI, especially superintelligent systems, could become uncontrollable if their objectives do not align with human values. The “standard model” of AI, where machines optimize fixed objectives, is problematic because if those objectives are misaligned, machines might pursue harmful outcomes. This is a serious concern as the gap between human intelligence and AI increases.
From a personal perspective, this is a critical point because it touches on the inherent risk in developing AI systems that could surpass human control. The metaphor of a machine pursuing its goals unchecked is a strong reminder of the potential hazards of advancing AI without addressing this fundamental issue.
Reflecting on the cost of getting what we wish for, Russell says, “All this should come as no great surprise. For thousands of years, we have known the perils of getting exactly what you wish for. In every story where someone is granted three wishes, the third wish is always to undo the first two wishes”.
The concern arises from “the fact that we are planning to make entities that are far more powerful than humans”, which prompts the question: “How do we ensure that they never, ever have power over us?”
Russell quotes I. J. Good: “…Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus, the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.”
Russell warns about the control problem: “The chances are that we would be unprepared: if we built superintelligent machines with any degree of autonomy, we would soon find ourselves unable to control them.”
Human Intelligence vs Machine Intelligence
Russell notes that human intelligence evolved in ways that allow for adaptability and learning. In contrast, machine intelligence currently follows a rigid path of achieving predefined objectives.
Russell’s argument for the need to shift towards provably beneficial AI—systems that learn human preferences over time and act in a way that benefits humanity—feels like a necessary philosophical and technical shift.
From a personal perspective, this is a profound idea because it moves the conversation from AI as just a technological tool to AI as a cooperative entity that must align with human values.
With AI tutors, the potential of each child, no matter how poor, can be realized. The cost per child would be negligible, and that child would live a far richer and more productive life. The pursuit of artistic and intellectual endeavors, whether individually or collectively, would be a normal part of life rather than a rarefied luxury.
Misuse of AI
The text emphasizes the potential misuse of AI for political and social manipulation, which is already happening with algorithms that influence social media users for profit. Algorithms that maximize engagement by shaping user preferences have already contributed to political polarization and misinformation.
From a personal view, this is deeply concerning because it shows how even basic, non-superintelligent AI can have significant societal impacts.
The idea that AI systems could manipulate not just our media habits but potentially our worldviews highlights the importance of building systems with accountability and ethical oversight.
Russell discusses the role of AI in spreading misinformation, disinformation, and manipulation, and the failure of nations to combat them. He says that “democratic nations, particularly the United States, have for the most part been reluctant—or constitutionally unable—to prevent the imparting of false information on matters of public concern because of justifiable fears regarding government control of speech.
Rather than pursuing the idea that there is no freedom of thought without access to true information, democracies seem to have placed a naïve trust in the idea that the truth will win out in the end, and this trust has left us unprotected”.
A New Model for AI
Russell suggests a new model where AI systems are uncertain about human objectives and therefore seek clarification from humans, allowing humans to retain control.
This proposal is a refreshing take on AI development, stressing humility in design, where the machine’s uncertainty is a feature, not a bug. Machines would ask for permission, accept corrections, and allow themselves to be turned off if necessary.
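The reasoning behind this deference can be illustrated with a small calculation. The following is only a toy sketch, not Russell's own formalism: the function name `option_values`, the sample numbers, and the assumption that the human always knows whether an action is good for them are all hypothetical choices made for illustration. The machine holds a belief (a list of equally likely guesses) about how much the human values a proposed action, and compares three choices: act without asking, switch itself off, or defer to the human.

```python
from statistics import mean


def option_values(utility_samples):
    """Expected value of three choices, given the machine's belief about
    how much the human values a proposed action (hypothetical toy model)."""
    samples = list(utility_samples)
    return {
        # Act without asking: the machine gets whatever the action is worth.
        "act": mean(samples),
        # Switch itself off: nothing happens, so the value is 0.
        "switch_off": 0.0,
        # Defer: the human (assumed to know their own preference) allows the
        # action only when it is worth more than 0, otherwise switches off.
        "defer": mean(max(u, 0.0) for u in samples),
    }


# The machine is unsure whether the action helps or harms.
uncertain = option_values([-2.0, 1.0, 3.0])
# The machine is (rightly or wrongly) certain the action is good.
certain = option_values([2.0, 2.0, 2.0])

print(uncertain)
print(certain)
```

Under these assumptions, deferring is never worse than acting or switching off, and it is strictly better whenever the machine is unsure whether the action helps or harms. When the machine is certain, deferring gains nothing, which is one way to see why uncertainty about human objectives gives the machine a reason to accept being switched off.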
Russell recognizes the obvious danger of developing superintelligence: “it seems that the march towards superhuman intelligence is unstoppable, but success might be the undoing of the human race.”
This vision is personally compelling because it suggests that future AI systems could maintain a form of deference to human authority, addressing one of the most pressing ethical concerns about runaway AI development.
The danger of AI envisioned by Russell: “Once the practical incentive to pass our civilization on to the next generation disappears, it will be very hard to reverse the process. One trillion years of cumulative learning would, in a real sense, be lost. We would become passengers in a cruise ship run by machines, on a cruise that goes on forever—exactly as envisaged in the film WALL-E”.
Controlling Human Behavior
A passage of Human Compatible discusses the dangers of surveillance technology and AI in manipulating human behavior.
Once surveillance is established, AI systems can monitor individuals and exploit their actions through methods like automated blackmail, identifying missteps to extract money or enforce political control. AI-driven systems can manipulate behavior by shaping individuals’ information environments and presenting tailored messages that reinforce specific beliefs and decisions, much like advertisers and propagandists do.
The technology’s sophistication increases as AI learns from real-time feedback, further enhancing its ability to influence behavior.
AI can also generate deepfakes—realistic fake content like videos or audio—which can create false beliefs and damage reputations. AI systems and bot armies can overwhelm online platforms with misinformation, undermining trust and disrupting marketplaces that depend on reputation systems. Governments could use such technologies for direct control, rewarding and punishing behavior to align with state objectives, and turning citizens into subjects of behavioral optimization.
Russell describes the possible future scenario in the following way, “the combination of AI, computer graphics, and speech synthesis is making it possible to generate deepfakes—realistic video and audio content of just about anyone saying or doing just about anything. The technology will require little more than a verbal description of the desired event, making it usable by more or less anyone in the world.
Cell phone video of Senator X accepting a bribe from cocaine dealer Y at shady establishment Z? No problem! This kind of content can induce unshakeable beliefs in things that never happened.
In addition, AI systems can generate millions of false identities—the so-called bot armies—that can pump out billions of comments, tweets, and recommendations daily, swamping the efforts of mere humans to exchange truthful information. Online marketplaces such as eBay, Taobao, and Amazon that rely on reputation systems to build trust between buyers and sellers are constantly at war with bot armies designed to corrupt the markets.”
However, systems of state behavioral control have significant drawbacks. They ignore the psychological costs of living under constant surveillance, eroding genuine human kindness and reducing acts of compassion to self-serving actions. This approach also risks promoting superficial compliance with official goals rather than true improvements, as individuals optimize for the system’s metrics instead of genuine societal contribution. Ultimately, imposing a uniform measure of virtue undermines the diversity of individual contributions that a successful society requires.
Russell imagines a world where AI dominates humanity. Yes, it is logically possible, as Russell puts it, that such machines could take over the world and subjugate or eliminate the human race. If that is all one has to go on, then indeed the only plausible response available to us, at the present time, is to attempt to curtail artificial intelligence research, specifically to ban the development and deployment of general-purpose, human-level AI systems.
The Future of AI
Russell envisions a world where AI can either lead to human extinction or a golden age. The pivotal factor is how we manage its development. The idea that success in creating superintelligent AI could be the last achievement of humanity makes the problem urgent.
Yet Russell asks, “Does this rapid rate of progress (in AI) mean that we are about to be overtaken by machines? No. There are several breakthroughs that have to happen before we have anything resembling machines with superhuman intelligence”. It cannot be said with certainty how such a machine would behave.
As machines, unlike humans, have no objectives of their own, we have to be careful about the objectives we put into them. In other words, we build optimizing machines, we feed objectives into them, and off they go.
Russell quotes Norbert Wiener, a legendary professor at MIT: “If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively . . . we had better be quite sure that the purpose put into the machine is the purpose which we really desire”. If we put the wrong objective into a machine that is more intelligent than us, it will achieve the objective, and we lose.
Russell proposes that “Machines are beneficial to the extent that their actions can be expected to achieve our objectives”.
From a personal standpoint, this feels both exhilarating and terrifying. On one hand, AI could unlock unparalleled advancements, yet on the other hand, mishandling its control mechanisms could result in disaster. This duality speaks to the immense responsibility that lies with AI researchers and developers.
Russell also addresses critics who dismiss talk of risk. Leaving aside the tribal notion that anyone mentioning risks is “against AI,” he notes that both Zuckerberg and Etzioni are arguing that to talk about risks is to ignore the potential benefits of AI or even to negate them.
Technological Automation
Stuart Russell discusses the impact of technology on human employment and job automation extensively in Human Compatible: Artificial Intelligence and the Problem of Control. His views on the subject highlight both the opportunities and the risks that automation poses to employment, especially in the context of AI-driven systems.
He observes that “the direct effects of technology work both ways: at first, by increasing productivity, technology can increase employment by reducing the price of an activity and thereby increasing demand; subsequently, further increases in technology mean that fewer and fewer humans are required”.
Historically, most mainstream economists have argued from the “big picture” view: automation increases productivity, so, as a whole, humans are better off, in the sense that we enjoy more goods and services for the same amount of work.
Russell says that as AI progresses, it is certainly possible—perhaps even likely—that within the next few decades essentially all routine physical and mental labor will be done more cheaply by machines.
The scenario of technological unemployment is indeed grim. As Russell puts it, “Unfortunately, between 2010 and 2016 about one hundred thousand tellers lost their jobs, and the US Bureau of Labor Statistics (BLS) predicts another forty thousand job losses by 2026: ‘Online banking and automation technology are expected to continue replacing more job duties that tellers traditionally performed.’”
The number per capita of retail cashiers dropped by 5 per cent from 1997 to 2015, and the BLS says, “Advances in technology, such as self-service checkout stands in retail stores and increasing online sales, will continue to limit the need for cashiers.” The same is true of almost all low-skilled occupations that involve working with machines.
White-collar jobs such as insurance underwriting are also at risk, as automated underwriting software allows workers to process applications more quickly than before, reducing the need for underwriters. If language technology develops as expected, many sales and customer service jobs will also be vulnerable, as will jobs in the legal profession.
Indeed, almost anything that can be outsourced is a good candidate for automation, because outsourcing involves decomposing jobs into tasks that can be parceled up and distributed in a decontextualized form. The robot process automation industry produces software tools that achieve exactly this effect for clerical tasks performed online.
This points to a future in which human labor may become irrelevant. Russell sees this possibility: “One rapidly emerging picture is that of an economy where far fewer people work because work is unnecessary”.
Key Points on Job Automation:
1. Vulnerable Sectors
Russell points out that many sectors are vulnerable to job automation, particularly low-skilled occupations. Examples include:
✔ Retail cashiers: With self-checkout systems and the rise of online shopping, the demand for cashiers has significantly decreased.
✔ Bank tellers: Automation and online banking have resulted in significant job reductions among bank tellers.
✔ Truck drivers: With the advent of autonomous vehicles, millions of truck driving jobs in the U.S. are at risk as companies like Amazon test self-driving trucks for freight haulage.
By contrast, professions built on interpersonal services, such as psychotherapists, executive coaches, tutors, counselors, companions, and those who care for children and the elderly, are likely to remain viable.
2. White-Collar Jobs Are Not Exempt
Even traditionally safe, white-collar jobs are becoming vulnerable. For instance:
✔ Insurance underwriters: Automated underwriting software can process applications faster than human workers.
✔ Legal professions: AI systems are outperforming human lawyers in specific tasks like analyzing non-disclosure agreements.
3. The Threat of Widespread Job Loss
✔ Russell highlights a broader concern that automation could eventually replace a significant portion of the labor market. With AI advancements, not only are repetitive tasks vulnerable, but even routine mental work could be automated.
This extends to fields like routine programming, data entry, and customer service.
4. Compensation Effects
Economists have long debated whether the jobs created by automation can compensate for those lost. Russell argues that while some jobs will emerge, the net effect on employment is likely negative. He uses the metaphor of horses replaced by cars—once the internal combustion engine arrived, most horses became redundant.
Likewise, many jobs could disappear, leaving humans struggling to compete for the fewer highly skilled roles that remain.
5. Economic Inequality and Social Implications
A critical concern Russell raises is the inequality automation might exacerbate. As AI reduces the demand for human labor, wages in many sectors could drop below subsistence levels, similar to what happened to horses with the rise of mechanized transportation.
The concentration of wealth could increase further as the owners of automated systems reap the benefits.
6. The Need for New Economic Models
Finally, Russell notes that retraining workers for high-tech roles like data science is an inadequate solution. The number of such jobs is limited compared to the scale of potential job losses.
He suggests that societies will need to rethink economic models, perhaps exploring options like universal basic income (UBI), to cope with a future where many traditional jobs no longer exist.
Russell invokes Keynes’s essay “Economic Possibilities for Our Grandchildren” and its picture of an economy where far fewer people work because work is unnecessary. “Thus, modern proponents of Keynes’s vision usually support some form of universal basic income or UBI. Funded by value-added taxes or by taxes on income from capital, UBI would provide a reasonable income to every adult, regardless of circumstance.
Those who aspire to a higher standard of living can still work without losing the UBI, while those who do not can spend their time as they see fit”, Russell writes.
Conclusion
In conclusion, Russell offers a sobering view of the future of work in an AI-driven world, urging policymakers to prepare for the significant social and economic disruptions that widespread job automation could cause.
Russell’s book serves as both a warning and a guidepost, encouraging a rethinking of how AI should be designed, controlled, and integrated into society.
The challenge isn’t just about developing smarter machines, but about ensuring that their goals align with human well-being, making them human-compatible. The takeaway here is the necessity of maintaining control and ensuring that AI systems serve humanity, not the other way around.