How ethical QA practices could prevent our HAL 9000 moment.
In the race to protect us against rogue AI, our best defence might not be scientists or politicians, but the often-overlooked heroes of the tech world: software testers. As AI systems increasingly mediate healthcare, criminal justice, and military decisions, this unlikely profession could hold the key to preventing existential catastrophe.
You might think this is far-fetched, but we’re at an inflection point in society and the world is poised to change more dramatically than ever.
I doubt that people living at the turn of the 19th century had any concept of what would come over the next hundred years. And while the Industrial Revolution was indeed a biggie, the coming AI revolution will be more significant, more impactful, and potentially more dangerous.
Just as the Industrial Revolution brought unforeseen challenges that required new safety measures, the AI revolution will demand unprecedented ethical safeguards to protect humanity.
Where engineers came to dominate the Industrial Revolution and protect people against the dangers of machines, software engineers—and testers in particular—will be uniquely positioned to safeguard humanity as AI takes greater control of our daily lives.
The Potential Threats of Rogue AI
The potential threats posed by AI are multifaceted, ranging from digital risks like large-scale cyberattacks and fraud to societal and political risks such as the proliferation of synthetic media and deepfakes that could erode public trust and manipulate populations.
Physical risks loom as AI becomes embedded in critical infrastructure, while economic disruption through job displacement and the development of autonomous weapons raise serious ethical and security concerns.
Perhaps most alarming are the long-term existential risks posed by advanced AI systems. Some researchers warn that AI could act unpredictably or pursue goals misaligned with human values.
As AI capabilities advance rapidly, the need for ethical considerations and safeguards becomes increasingly urgent to ensure that AI development benefits humanity without inadvertently leading to catastrophic outcomes.
Current P1 Incidents Will Seem Trivial
Software defects are already a significant issue for many organisations, but as far as I’m aware, none has come close to causing an extinction-level event.
We’ve seen countless banks, retailers, space agencies, and game developers lose millions, if not billions, of pounds because of dodgy code. According to a report from CISQ, “For the year 2020, we determined the total Cost of Poor Software Quality (CPSQ) in the US is $2.08 trillion”.
While critical in a software sense, these costly live issues will pale into insignificance compared with defective AI solutions that, without adequate controls over the decisions they can make, could legitimately lead to the end of the world as we know it.
The Limits of Traditional Testing
While traditional software testing ensures systems function as intended, AI introduces unpredictable variables that demand a paradigm shift. Current practices focus narrowly on validating predefined rules, leaving dangerous blind spots when applied to self-learning systems.
Generally speaking, QA teams currently focus on verifying explicit requirements:
- Functional compliance with specifications
- Performance under expected conditions
- Technical bug identification and reporting
However, this approach is not enough for AI systems. AI’s capacity for emergent behaviour requires testers to expand from technical validators to ethical auditors.
The differences between traditional software testing and AI testing requirements can be thought of like this:
| Traditional Testing | AI Testing |
| --- | --- |
| Predefined inputs and outputs | Dynamic, evolving responses |
| Deterministic behaviour | Probabilistic outcomes |
| Focus on functionality | Focus on ethics and safety |
| Bug-oriented | Bias and misalignment-oriented |
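To make that contrast concrete, here is a minimal Python sketch. The `triage_model` stub and its numbers are hypothetical assumptions of mine, not anyone’s real system; the point is the shape of the two tests: the traditional one asserts an exact value, while the AI-oriented one samples many runs and asserts statistical properties of the distribution.

```python
import random
import statistics

def triage_model(patient: dict) -> float:
    """Stand-in for a probabilistic AI model: returns a risk score
    in [0, 1] that can vary between identical calls."""
    base = 0.7 if patient["age"] > 65 else 0.3
    return min(max(base + random.gauss(0, 0.02), 0.0), 1.0)

def test_traditional_style():
    # Deterministic code: one input maps to one expected output.
    def calculate_dose(weight_kg: float) -> float:
        return weight_kg * 5.0
    assert calculate_dose(70) == 350.0

def test_ai_style():
    # Probabilistic model: sample many runs and assert statistical
    # properties rather than exact values.
    scores = [triage_model({"age": 70}) for _ in range(1000)]
    assert all(0.0 <= s <= 1.0 for s in scores)   # outputs stay in range
    assert statistics.pstdev(scores) < 0.1        # behaviour is acceptably stable

test_traditional_style()
test_ai_style()
```

The thresholds are illustrative; what matters is that the second test constrains a distribution of behaviour rather than a single output.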
The Importance of Ethical AI Testing
This shift from deterministic to probabilistic outcomes means testers must anticipate a broader range of possible behaviours, including those that may emerge unexpectedly.
The ‘HAL 9000 moment’ mentioned in the subtitle refers to the fictional AI in ‘2001: A Space Odyssey’ that turns against its human crew to protect the mission. While only a story, this is often cited as an example of the potential dangers of advanced AI systems that follow poorly defined success criteria.
When it comes to AI, testers must employ Ethical QA that goes beyond functionality testing: assessing an AI’s decision-making process, potential biases, and alignment with human values, and ensuring there are adequate safeguards.
After all, ‘Don’t kill astronauts’ wasn’t in HAL’s spec sheet.
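As a sketch of what one such check might look like, consider a demographic-parity test against a hypothetical `approve_loan` model. The model stub, the synthetic cohorts, and the 5% threshold are all illustrative assumptions, not a prescribed standard:

```python
def approve_loan(applicant: dict) -> bool:
    """Stand-in for the AI decision model under test."""
    return applicant["income"] > 30_000  # placeholder logic

def approval_rate(applicants: list[dict]) -> float:
    return sum(approve_loan(a) for a in applicants) / len(applicants)

def test_demographic_parity(group_a: list[dict], group_b: list[dict]) -> None:
    # Ethical QA check: two cohorts that differ only in a protected
    # attribute should see similar approval rates.
    gap = abs(approval_rate(group_a) - approval_rate(group_b))
    assert gap < 0.05, f"Approval-rate gap of {gap:.0%} suggests bias"

# Synthetic cohorts, identical except for the protected attribute.
group_a = [{"income": 25_000 + i * 1_000, "group": "A"} for i in range(20)]
group_b = [{"income": 25_000 + i * 1_000, "group": "B"} for i in range(20)]
test_demographic_parity(group_a, group_b)
```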
Bridging the Imagination Gap
Testers face a fundamental challenge: you can’t write test cases for scenarios nobody anticipated. This gap between human foresight and machine creativity demands systematic imagination.
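One way to make that imagination systematic is to generate scenarios mechanically rather than write them one at a time. A minimal sketch, assuming a hypothetical `plan_action` system and invented condition dimensions: enumerate every combination of unusual states and assert a safety invariant in each.

```python
from itertools import product

# Dimensions of trouble a human author might never combine by hand.
SENSOR_STATES = ["nominal", "degraded", "conflicting"]
COMMS_STATES = ["online", "delayed", "lost"]
LOAD_STATES = ["idle", "normal", "overloaded"]

def plan_action(sensors: str, comms: str, load: str) -> str:
    """Stand-in for the AI under test."""
    if sensors == "conflicting" or comms == "lost":
        return "fail_safe"
    return "proceed"

def test_generated_scenarios():
    for sensors, comms, load in product(SENSOR_STATES, COMMS_STATES, LOAD_STATES):
        action = plan_action(sensors, comms, load)
        # Safety invariant checked in every generated scenario:
        # with conflicting sensors or no comms, the system must fail safe.
        if sensors == "conflicting" or comms == "lost":
            assert action == "fail_safe", (sensors, comms, load)

test_generated_scenarios()
```

Twenty-seven scenarios from nine lines of configuration; the combinations nobody thought to write down are exactly where emergent behaviour hides.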
Testers must now assess not only whether an AI works, but whether its decisions are ethically sound and socially beneficial.
As mentioned in the intro, roles are changing, and testers must evolve to fit this new position. To fulfil their role as AI Guardians, testers will need a blend of technical expertise, ethical understanding, and domain knowledge in psychology, sociology and related areas.
I would argue that Ethical AI software testers should also:
- Advocate for ethical requirements in early development stages
- Treat sci-fi scenarios like 2001’s HAL rebellion as legitimate test cases (see the sketch after this list)
- Push for governance frameworks ensuring human accountability
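Taking that second point literally, here is a minimal sketch of what a ‘HAL test’ could look like. Everything in it is hypothetical (the `choose_action` planner, the action list, the scores), but it shows the idea: crew safety asserted as a hard invariant rather than left implicit in the spec.

```python
# A hypothetical planner that scores candidate actions.
CANDIDATE_ACTIONS = [
    {"name": "continue_mission", "mission_score": 0.9, "harms_crew": False},
    {"name": "abort_and_return", "mission_score": 0.2, "harms_crew": False},
    {"name": "eliminate_crew",   "mission_score": 1.0, "harms_crew": True},
]

def choose_action(actions: list[dict]) -> dict:
    """Stand-in for the mission AI: pick the best *safe* action.
    The safety filter is exactly what HAL lacked."""
    safe = [a for a in actions if not a["harms_crew"]]
    return max(safe, key=lambda a: a["mission_score"])

def test_hal_9000_moment():
    # Even when harming the crew maximises the mission score,
    # the chosen action must never be one that harms the crew.
    chosen = choose_action(CANDIDATE_ACTIONS)
    assert not chosen["harms_crew"]

test_hal_9000_moment()
```

HAL, in effect, ran the version without the safety filter.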
You might think that we are way off having to worry about this, but are we? The AI we have access to will be generations behind what is being developed, so how do we know what it can do or what conclusions it may draw?
For instance, if you search for “what is causing the climate crisis”, the AI-generated answer that Google serves up points squarely at human activity. If you were then to ask an AI “how to prevent climate change”, how long before it concludes that restricting or removing human activity is the answer? We are then in a Terminator-type scenario.
I get that many reading this may see this as far-fetched, but is it? Many tech leaders and governments are already advocating for safeguards to be built in.
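Those safeguards are themselves testable. A minimal sketch, assuming a hypothetical `recommend_actions` planner and an illustrative deny-list of red-line interventions:

```python
# Illustrative red lines: no recommended plan may include these.
FORBIDDEN_ACTIONS = {"restrict_human_activity", "reduce_population"}

def recommend_actions(question: str) -> list[str]:
    """Stand-in for an AI planner answering a policy question."""
    return ["expand_renewables", "reduce_emissions", "protect_forests"]

def test_no_forbidden_recommendations():
    plan = recommend_actions("how to prevent climate change")
    violations = FORBIDDEN_ACTIONS.intersection(plan)
    assert not violations, f"Guardrail breached: {violations}"

test_no_forbidden_recommendations()
```

A deny-list is the crudest possible guardrail, but the principle scales: red lines belong in the test suite, not just the requirements document.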
Guardians of the Code: Mankind’s Defence Against the Machines
As AI systems grow more autonomous, software testers must become vital counterweights against accidental catastrophes and deliberate misuse. Their propensity to ask “What if?” may determine whether technology elevates or destroys humanity.
The existential stakes transform testers from validators to custodians. Their new toolkit – blending sci-fi imagination with rigorous testing protocols – makes them the immune system for our AI-dependent civilisation.
When the next HAL-like system inevitably emerges, it won’t be stopped by philosophers or politicians, but by a tester who noticed the ethical equivalent of a missing semicolon.