In his 1950 work I, Robot [1], Isaac Asimov sets out the Three Laws of Robotics:
- A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
- A robot must protect its own existence if such protection does not conflict with the First or Second Law.
Though seemingly sensible at first glance, these laws, as Asimov’s book demonstrates, do not by themselves suffice to avoid humanitarian catastrophe. He shows the murkiness of morality and thus its incompatibility with the black-and-white world of rationalism and logic. This should come as no surprise. Absolutist moral principles like Asimov’s Three Laws of Robotics have never had much success at avoiding moral contradictions. Take Immanuel Kant’s categorical imperative [2], which holds that an act is immoral if you could not will that everyone else act on the same maxim. Lying, according to Kant, is therefore categorically immoral, as it is not a “universalizable” action. Yet there are scenarios in which lying intuitively seems moral: lying, for example, about the whereabouts of your friend to a bloodthirsty murderer who wants to kill them for their own pleasure. Or take Bentham’s two sovereign masters of pleasure and pain [3], from which it follows that morality is a matter of maximising the former and minimising the latter. If ten sadistic guards want to hurt one prisoner for their own pleasure, utilitarian principles suggest it would be immoral to protect the prisoner, since the guards’ combined pleasure outweighs the prisoner’s pain.
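To make the utilitarian worry concrete, here is a toy calculation of the guards scenario (my own illustration, with made-up utility units that Bentham of course never specifies): a naive sum of pleasure and pain comes out in favour of letting the cruelty happen.

```python
# Toy felicific arithmetic for the guards example (illustrative numbers only).
guard_pleasure_each = 5   # pleasure each sadistic guard gains, arbitrary units
num_guards = 10
prisoner_pain = 30        # the prisoner's suffering, same arbitrary units

net_if_allowed = num_guards * guard_pleasure_each - prisoner_pain  # 50 - 30 = 20
net_if_prevented = 0                                               # no pleasure, no pain

# Pure maximisation prefers allowing the cruelty, so protecting the prisoner
# comes out as the utility-reducing (hence "immoral") choice.
print(net_if_allowed > net_if_prevented)  # True
```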
It is well understood across belief systems that there is no simple step-by-step guide to morality. There are grey areas, caveats, exceptions and so on. Morality is a never-ending game of mental gymnastics. Aristotle understood moral decision-making to be a virtue like any other, one that must be practised repeatedly. Morality requires wisdom, intuition, and common sense: the faculties that let a human override the rules in cases like the bloodthirsty murderer or the sadistic guards, and the faculties that machines, unlike humans, lack. A machine has no ability to make moral inferences beyond exactly what it has been told to do.
This is well demonstrated by a hypothetical anecdote. Imagine that, without your knowledge, the nearest coffee shop to your house has hiked its prices so that a coffee now costs one million dollars. You ask your AI robot assistant to go out and get you a coffee from your local coffee shop. Under the instruction to get you a coffee, the robot buys you a million-dollar cup of coffee. Lacking common sense, it cannot make the obvious inference that you would much rather keep the one million dollars and go without the coffee.
This hypothetical illustrates how hard it is to get an entity without common sense to behave as you intend, and hence how hard it is to program morality into a machine.
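A minimal sketch of the anecdote (the assistant, its instruction format and the price check are all hypothetical, not any real product’s API) shows where the difficulty lives: the “common sense” has to be written down as an explicit check, because the machine will not supply it on its own.

```python
COFFEE_PRICE_USD = 1_000_000  # the shop hiked its prices overnight


def naive_assistant(instruction: str) -> str:
    """Follows the instruction literally, at whatever the coffee happens to cost."""
    if instruction == "get me a coffee":
        return f"Bought 1 coffee for ${COFFEE_PRICE_USD:,}"
    return "Instruction not understood"


def assistant_with_sanity_check(instruction: str, max_reasonable_price: float = 10.0) -> str:
    """Same behaviour, plus a hand-coded stand-in for common sense."""
    if instruction == "get me a coffee":
        if COFFEE_PRICE_USD > max_reasonable_price:
            return (f"Coffee costs ${COFFEE_PRICE_USD:,}; "
                    "checking back with you instead of buying it")
        return f"Bought 1 coffee for ${COFFEE_PRICE_USD:,}"
    return "Instruction not understood"


print(naive_assistant("get me a coffee"))              # happily spends a million dollars
print(assistant_with_sanity_check("get me a coffee"))  # pauses and asks first
```

The catch is that the price check exists only because a programmer anticipated it; every other piece of common sense would have to be enumerated in the same way, which is exactly the problem.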
While absolutist principles offer little help in programming morality into machines, that is not to say it cannot be done. Autonomous vehicles (AVs) make moral decisions all the time without direct real-time input from a human. They avoid collisions with other cars, pedestrians, and animals, with as much success as, if not more than, a human driver.
Unlike with the unyielding Three Laws of Robotics, this is achieved by training the vehicles on a case-by-case basis. While this method is better suited to the prickly and inconsistent world of morality, it pales in comparison to its absolutist cousin in unforeseen circumstances. AI models do well in situations with a limited number of variables, but the variables to account for while driving are numerous and difficult to enumerate. A manoeuvre such as overtaking a slow-moving car could be the safest option in one weather condition (when it is dry) but catastrophic in another (when it is raining). To cover the niche scenarios that no reasonable amount of pre-training could anticipate, AV companies like Tesla feed data from their customers’ real-world driving back into the neural network to further train the automated driving system.
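The contrast between the two approaches can be sketched in a few lines (a deliberately toy model of the overtaking example, nothing like a production driving stack): a fixed rule ignores the weather entirely, while a case-by-case policy only does better if its training data happens to contain a similar rainy case.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    gap_m: float        # clear gap ahead for the manoeuvre, metres
    speed_delta: float  # our speed minus the lead car's speed, km/h
    wet_road: bool      # is it raining / is the surface wet?


def rule_based_overtake(s: Scenario) -> bool:
    """Absolutist rule: overtake whenever the gap and speed margin look big enough.
    Tuned for dry roads, it never considers the weather at all."""
    return s.gap_m > 200 and s.speed_delta > 20


# Case-by-case "training data": past scenarios labelled safe / not safe.
EXAMPLES = [
    (Scenario(250, 30, wet_road=False), True),
    (Scenario(250, 30, wet_road=True), False),  # same gap, but rain made it unsafe
    (Scenario(150, 25, wet_road=False), False),
    (Scenario(400, 35, wet_road=True), True),
]


def learned_overtake(s: Scenario) -> bool:
    """Toy 1-nearest-neighbour policy: decide as the most similar past case did."""
    def distance(a: Scenario, b: Scenario) -> float:
        return (abs(a.gap_m - b.gap_m) / 100
                + abs(a.speed_delta - b.speed_delta) / 10
                + (0.0 if a.wet_road == b.wet_road else 5.0))

    _nearest, was_safe = min(EXAMPLES, key=lambda ex: distance(ex[0], s))
    return was_safe


rainy = Scenario(gap_m=250, speed_delta=30, wet_road=True)
print(rule_based_overtake(rainy))  # True  -- the rule ignores the rain
print(learned_overtake(rainy))     # False -- a similar rainy case was labelled unsafe
```

The learned policy is only as good as its examples, which is why collecting fresh edge cases from customer fleets matters so much to AV companies.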
This effective but contentious way of training such systems has led to much moral debate around the deployment of these cars. The trade-off (accepting some reduction in human safety in the short term in order to improve road safety sooner) remains an open question, and it has sparked ongoing disputes such as that between Tesla and software creator and politician Dan O’Dowd [4].
It seems, then, that while programming morality into machines can be done effectively, doing so unavoidably raises moral difficulties.
The challenge of programming morality only grows when the line between right and wrong is less clearly defined. With AVs, it is usually obvious what is right and wrong. The right action is the one which best ensures the safety of the driver, other drivers, and pedestrians on the road.
People like to imagine a variety of trolley-problem scenarios for AVs. Although this is a thought-provoking pursuit, such situations are not worth spending much time pondering, since they are incredibly unlikely to arise. It would seem foolish to halt the deployment of autonomous vehicles, which have far surpassed human drivers in driving safety tests and whose wider deployment will only reduce road accidents sooner, simply because there is no pre-programmed answer to whether an AV should hit five elderly women or two young children.
Apart from extremely rare and peculiar events that a human driver would be no better at navigating, what is right and wrong in AV programming is usually obvious.
The real concern arises with technologies where the moral judgements that need to be made are less straightforward. As AI continues to interlace itself into society and grow in complexity, there is a push to build artificial moral agents (AMAs), which could help with government policy decisions in an AI/human hybrid world. Important questions ensue. Who do we let program morality into these AI agents? How can we trust the intentions of the programmers, and consequently of the AMAs? Are we instilling in them a dangerous level of consciousness? At what point could we lose control over these entities? I cannot begin to address these open questions now, as I would fail to do them justice. However, they remain important questions to consider as we continue to propel ourselves full throttle into this brave new world.
References
- [1] Asimov, Isaac. I, Robot (1950).
- [2] Kant, Immanuel, and Jerome B. Schneewind. Groundwork for the Metaphysics of Morals. Yale University Press (2002).
- [3] Bentham, Jeremy. The Collected Works of Jeremy Bentham: An Introduction to the Principles of Morals and Legislation. Clarendon Press (1996).
- [4] Ross, Phillip. “Billionaire Brings Tesla Autopilot Rebuke: Outspoken critic delivers sobriety check on EV maker’s Full Self-Driving mode.” IEEE Spectrum (2023).