Embracing critical thinking and privacy in the age of AI: Why questioning AI and protecting data matters

Dec 10, 2024

While AI has been a prominent emerging technology for at least a decade, the speed of progress in the past few years has been remarkable, and possibly unsettling. Generative AI, most notably in the form of Large Language Models (LLMs) such as ChatGPT, has gone viral: the first general-purpose AI technology to reach a global audience. Michael Wooldridge, Professor of Computer Science at the University of Oxford and Director of Foundational AI Research at The Alan Turing Institute, appeared as keynote speaker at the CIPD conference in Manchester, where he tackled key questions around recent trends in AI: where did they come from, what do they mean, and where are they going?

As AI systems integrate more deeply into everyday life and complex decision-making, there is a need to cultivate a balanced, critically aware approach to these technologies, he argued. Since every one of us interacts with AI in our daily lives, from chatbots and recommendation algorithms to computer software and wearable technology such as the Apple Watch, we need to be aware that all of these interactions form “training data” for AI systems.

He told the audience that the key to balancing technology with the human touch is learning to critically question AI’s outputs, to engage actively in “prompt engineering” to get better results from AI, and to treat personal data with the importance it deserves.

How and why AI has exploded in 2024

Professor Wooldridge explained that AI has been around since the 1960s, but its development only began to gather pace in the last decade. It took the advent of massive computing power, and an understanding of how to train AI systems, for the field to progress.

“The key point in this century was that we learned how to train neural networks,” he said, explaining why AI has been going through a recent period of rapid expansion. “When we train a neural network, we show it an input and we show it the desired output. We adjust the network so that it produces an answer that is closer to the answer we want. That is how we train it.”
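To make that loop concrete, the process he describes can be sketched in a few lines of Python. This is purely illustrative, not code from the talk: a single artificial neuron with one adjustable weight is repeatedly shown an input and the desired output, and the weight is nudged until its answers move closer to the ones we want.

```python
# Illustrative sketch only (not from the talk): one "neuron" with a single
# adjustable weight learns to double a number by being shown inputs and
# desired outputs, and having its weight nudged towards better answers.
import random

weight = random.random()      # the network's only adjustable parameter
learning_rate = 0.01

training_data = [(x, 2 * x) for x in range(1, 6)]   # inputs and desired outputs

for epoch in range(200):
    for x, desired in training_data:
        prediction = weight * x               # show the network an input
        error = prediction - desired          # compare with the desired output
        weight -= learning_rate * error * x   # adjust towards the answer we want

print(f"learned weight: {weight:.3f}")        # ends up very close to 2.0
```

Modern systems do exactly this with billions of adjustable weights and vast quantities of training data, which is why the massive computing power Professor Wooldridge mentions matters so much.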

Yet these systems, at their core, are limited by the data they learn from and the structure of their algorithms. Blindly trusting AI can lead to unintended consequences, whether in over-relying on automated decisions or in inadvertently surrendering control over personal information.

He cited the example of having checked what AI said about his own academic background. GPT-3, an earlier OpenAI model and the predecessor to ChatGPT, claimed he had studied at Cambridge, something he had never done and which does not appear in any information about him or in any CV.

“In 2020 I’m playing with GPT-3 and I say, ‘Tell me about Michael Wooldridge’. It says, ‘Michael Wooldridge is a professor at the University of Oxford known for his research in artificial intelligence and studied at the University of Cambridge’. I didn’t study at Cambridge. I’ve never had any affiliation with Cambridge whatsoever. Never worked there and I wasn’t a student there. Remember the way it works. It’s not designed to tell you the truth. It’s designed to take the most plausible thing. It’s read probably thousands of biographies of Oxford professors, and studying at Oxford or Cambridge is very, very common. So in the absence of any information, it’s filling in the gap with what it thinks is the most plausible. And it’s very plausible. If you read that, you wouldn’t have batted an eyelid. You would have taken it for granted.”

He explained that the programmes “lie convincingly a lot. They don’t know what the truth is, but the fact that they lie plausibly makes those lies particularly problematic in terms of bias and toxicity.”
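The following toy sketch is not how GPT-3 works internally, but it illustrates the principle he describes: a system that merely counts which words tend to follow which will fill a gap with the statistically most plausible continuation, whether or not it happens to be true.

```python
# Toy illustration (not the real model): count which word most often follows
# "studied at" in some made-up biographies, then use that to fill a gap.
from collections import Counter

biographies = [                       # imaginary training snippets
    "professor at Oxford who studied at Cambridge",
    "professor at Oxford who studied at Oxford",
    "professor at Oxford who studied at Cambridge",
]

continuations = Counter()
for text in biographies:
    words = text.split()
    for prev, curr, nxt in zip(words, words[1:], words[2:]):
        if (prev, curr) == ("studied", "at"):
            continuations[nxt] += 1

# With no real information about a new person, the likeliest continuation
# wins - here "Cambridge" - regardless of the truth.
print(continuations.most_common(1)[0][0])
```

Real language models work with probabilities over vastly richer contexts, but the underlying behaviour, predicting what is plausible rather than what is true, is the one Professor Wooldridge describes.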

AI is not a super brain: Questioning its outputs

For this reason, Professor Wooldridge says one of the most critical skills in the digital age is learning to treat AI with a healthy dose of scepticism. AI systems are designed to predict, optimise, and assist, but they aren’t infallible or omniscient. They operate on patterns derived from data, and this fundamental nature of AI means that anyone using it needs to be aware of its shortcomings. This is particularly true in cases where accuracy is of vital importance.

He cites the example of the Harm Assessment Risk Tool (HART) introduced by Durham Constabulary. This AI system was created to assist police in making decisions about whether a detainee should be held overnight in custody, based on whether there was a risk of reoffending or self-harm; it used historical data to predict the risk of releasing the person. He argues that, despite careful design and oversight, there is a real danger that over time people will defer unquestioningly to the system’s outputs. For example, a future officer might simply ask, “What does the AI say?” without critically considering additional context or seeking second opinions.

“My worry is that in 10 years’ time, people don’t question it,” he says. “So we keep them in the cell. You’ve got to argue with [the AI]. You’ve got to think of a reason to argue with it, and we find that tiring. I think the single most important skill is not treating the AI as if it is some kind of super brain that’s guaranteed to give you the right answers, because it’s not.”

The blending of AI and reality – what it means for regulation

Professor Wooldridge predicts that within one or two decades almost everything we read on the World Wide Web will be AI-generated, something he describes as “unsettling”.

“At that point, we simply won’t know whether we’re reading something or encountering something which is produced by a human or an AI, and that’s going to be a very unsettling time that seems almost inevitable,” he says.

“You can already see the signs of this, but by and large, poor-quality AI-generated content is going to get better, and it is inevitable. Who controls the technology? Well, the answer, at the moment, is a tiny number of extraordinarily wealthy US companies and two state-level actors, the US and China, as the UK doesn’t have the wherewithal to build a sovereign AI capability because it is too expensive and too risky. The cost of these models at the moment is approaching half a billion dollars each, and they’re difficult and unpredictable to build. They have a lifetime of around 18 months. So this is a real concern.”

Prompt engineering: Shaping effective AI responses

As large language models like ChatGPT have demonstrated, the way a question is framed can significantly influence the quality of AI-generated responses. This concept, often referred to as prompt engineering, has become an essential skill in using AI systems.

Prompt engineering highlights an unusual and unplanned-for sensitivity in these models, Professor Wooldridge says. A simple request to “think carefully” or an alternative phrasing can lead to notably different responses. While the model isn’t “thinking” in the human sense, the structure of the prompt affects the neural network’s output, often producing more relevant or accurate answers.

Understanding how best to structure these prompts can therefore be genuinely helpful. The skill is becoming increasingly valuable in areas like customer service, content creation and research, where fine-tuning responses can shape a more effective interaction.
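As a hypothetical illustration (not an example shown at the conference), the sketch below compares a loosely worded prompt with a more carefully framed one for the same task. It assumes the OpenAI Python library with an API key in the environment; the model name and the relocation scenario are illustrative assumptions.

```python
# Hypothetical prompt-engineering example (not from the talk).
# Assumes the OpenAI Python library and an OPENAI_API_KEY in the environment;
# the model name is illustrative and any comparable chat model would do.
from openai import OpenAI

client = OpenAI()

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",    # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

vague = "Summarise our relocation policy."

engineered = (
    "You are an HR adviser. Summarise the relocation policy below in three "
    "bullet points for an employee moving abroad for the first time, and "
    "flag anything that needs sign-off before travel.\n\nPOLICY TEXT: ..."
)

print(ask(vague))        # loosely framed request
print(ask(engineered))   # carefully framed request for the same task
```

The model receiving the second prompt is given a role, a format, an audience and the source material, which typically produces a more focused and useful answer.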

The privacy imperative and the dangers of data misuse

The rise of AI-powered platforms has brought with it an insatiable demand for data. Existing generative AI programmes like ChatGPT have been trained on essentially the whole of the public internet, including novels, reports, Reddit threads and images. Professor Wooldridge says personal information is collected, stored and used in myriad ways, often beyond the immediate awareness of users. Large tech companies want more data because that is how they can better train their AI models in the future.

“You go to every web page on the World Wide Web, you scrape all of the ordinary text, just the text, then you follow all the links in that web page. If you do that exhaustively, until you’ve captured the whole of the World Wide Web, everything, every advertising brochure, every company policy, every scientific paper, every bit of Reddit and Twitter, of Facebook, is there. The entirety of Wikipedia was just three per cent of that content and that shows the astronomical quantities of data collected,” he says.
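The crawl described in that quote can be sketched in outline. The snippet below is a minimal illustration rather than how any company actually harvests the web at scale; it assumes the widely used requests and BeautifulSoup libraries, and the start URL is a placeholder.

```python
# Minimal sketch of the crawl-and-scrape loop described above: fetch a page,
# keep just the ordinary text, then follow its links. Illustrative only.
from collections import deque
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def crawl(start_url: str, max_pages: int = 100) -> list[str]:
    seen, queue, texts = set(), deque([start_url]), []
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            page = requests.get(url, timeout=10)
        except requests.RequestException:
            continue
        soup = BeautifulSoup(page.text, "html.parser")
        texts.append(soup.get_text(" ", strip=True))   # keep just the text
        for link in soup.find_all("a", href=True):
            queue.append(urljoin(url, link["href"]))   # follow every link

    return texts

corpus = crawl("https://example.com")   # placeholder start page
```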

The new problem for tech companies is that, for AI to keep learning and growing in sophistication, it needs fresh data to be trained on.

“If you’ve already used all of the data on the World Wide Web, where do you get more data from? This is not a problem for us, but it is a problem for Microsoft and Google,” he says. “They’ve already used up the obvious resources of data, and it turns out that using AI-generated data leads to really poor quality AI, so you can’t train on AI-generated outputs. Why should you be concerned about that? Because they really want your data, the data that we provide and that we generate.”

While AI can provide incredible benefits, such as screening large numbers of scans for suspected cancer, there is also a darker side to data collection. Most of us routinely, and often unknowingly, consent to invasive data collection, especially through online features like cookies. Professor Wooldridge says that rejecting all cookies may be tedious, but it is an essential step toward preserving privacy. Once data is transferred to a third-party server, control over its use is significantly diminished, regardless of any protections or policies the company claims to have. In many cases, this data goes on to train machine learning algorithms, which can lead to models that reveal insights about individuals’ habits and preferences.

“This is an Apple Watch, and I find it very useful, but nevertheless it’s continually gathering data about me,” he says. “You want to check the settings to make sure it’s not being uploaded to a cloud somewhere in Silicon Valley. Once you hand over your private data, you’ve effectively lost control. It doesn’t matter what covenants are being placed on that data; if it’s on a Silicon Valley server, you no longer have control. Really, you should reject all cookies, otherwise it’s going to be gathering data, and that data is going to go into machine learning algorithms which are going to learn very uncomfortable personal things.”

What does the future hold for AI?

Given that Professor Wooldridge predicts AI-generated content and information will be ubiquitous within ten to twenty years, questions arise around data protection and regulation.

“There are three models of AI regulation in the world,” he says. “There’s the EU regulation, which is a GDPR-type regulation designed to protect human rights. It’s very aspirational. It remains to be seen how successful it is, but nevertheless it sets quite a high standard for protecting personal data and so on, just like GDPR does.

“Then there is US regulation, which is about protecting innovation and companies. Then there is Chinese legislation, which, broadly speaking, is about protecting the institutions of the Chinese state. The EU regulatory model is by far the most elaborate and is about protection particularly in high-risk scenarios where AI has the potential to actually do some harm. But I have to say, the jury is out on how successful that’s going to be.”

As AI systems grow in sophistication and autonomy, our responsibility as users and developers is to engage with these systems actively and thoughtfully, he says. Regulation is a key question for governments and nation states, and data privacy is a big issue for individuals and companies. While AI offers many opportunities, it can also be biased, inaccurate and misleading. We all need to be aware of its dangers as well as the benefits it can bring.