Can AI chatbots be stopped from going rogue?

Guest Contributor
February 28, 2024

By Eli Fathi and Peter MacKinnon

Eli Fathi, C.M., is Chair of MindBridge Analytics, based in Ottawa. Peter K. MacKinnon is Senior Research Associate, Engineering, at the University of Ottawa and a member of the IEEE-USA Artificial Intelligence Policy Committee, Washington, D.C.

Note: The opinions expressed here are those of the authors and don’t necessarily reflect the views of the organizations they are associated with. 

Large Language Models (LLMs) such as ChatGPT and image-generation models such as Midjourney use generative AI to create high-quality text, images, videos and other content, trained primarily on vast quantities of data scraped from the internet.

It is known empirically and historically that some people, whether under financial stress, in pursuit of career advancement or driven by ideology, are prone to act outside the laws, values and long-standing traditions of society. Furthermore, the integrity of the training data used in LLMs cannot always be guaranteed, raising the risk that a model trained on poisoned or corrupted data will be compromised in its operational integrity.

Given that humans develop these LLMs and image datasets, we can expect that, by omission as well as by intent, some models will contain a “back door” that allows them to go rogue. The real questions are what happens when a model does not behave as the user intended, or deceives the user about that intent, and who is responsible.

This is of concern because these models can be used to create a wide range of fake and fraudulent outputs designed to deceive, coerce, bully, defraud and enable many other nefarious activities. A case in point is the recent AI-generated deepfake videos of Taylor Swift and U.S. President Joe Biden posted anonymously to the internet. Rogue models could interfere with elections, degrade critical infrastructure operations and disrupt the daily lives of millions of people.

It is not possible to simply “pull the plug” on a rogue LLM, as copies can proliferate across the internet, and LLM-powered apps are accessible from billions of smartphones anywhere in the world. Clearly, companies that develop these models must adopt guardrails voluntarily or be legislated into adhering to a set of acceptable standards, even though such standards do not yet exist.

Commercial LLMs have some built-in safety measures (guardrails) that control interaction with the model and ensure adherence to the expected design and performance outcomes. This is akin to air traffic controllers ensuring smooth operations and detecting unusual situations.
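To make the idea of a guardrail concrete, here is a minimal, purely illustrative sketch of an output filter that screens a model's candidate response before it reaches the user. Production systems use trained safety classifiers and policy models rather than keyword lists; the patterns and the `apply_guardrail` function here are hypothetical, invented for this example.

```python
import re

# Hypothetical blocklist: a real guardrail would use a trained classifier,
# not regular expressions. These patterns are illustrative only.
BLOCKED_PATTERNS = [
    re.compile(r"\bhow to (make|build) (a )?(bomb|weapon)\b", re.IGNORECASE),
    re.compile(r"\bcredit card number\b", re.IGNORECASE),
]

def apply_guardrail(model_output: str) -> str:
    """Pass the output through unchanged if it is clean;
    otherwise substitute a refusal message."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(model_output):
            return "This response was withheld by a safety filter."
    return model_output

print(apply_guardrail("The capital of France is Paris."))
print(apply_guardrail("Sure, here is how to make a bomb: ..."))
```

The point of the sketch is the architecture, not the patterns: the filter sits between the model and the user, like the air traffic controller in the analogy above, and every response must clear it before release.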

Another problem is how to deal with an LLM's “hallucinations,” which can occur in two ways and may deceive users. First, in the absence of relevant training data, the model can fabricate information (e.g. inventing fake legal case law). Second, it can correlate unrelated pieces of data from different contexts and force a connection between them even though none is relevant to the problem at hand. This can lead users into bizarre conversations with a chatbot, with the possibility of self-harm or harm to others.

Furthermore, because LLMs are vulnerable to being gamed, apps powered by them are susceptible to being “hypnotized” into generating improper or inappropriate information outside their intended outcomes. This puts unsuspecting end users at a major disadvantage if they accept results without challenging their logic, which in turn could jeopardize the end user's intent.

Legal implications of using LLMs

There are also legal ramifications associated with using LLMs, as much of the training material is gathered using a “boil the ocean” philosophy of scraping the internet for training data. This means that information is used without the prior consent of the content creators and the rightful owners of such material. For example, The New York Times recently launched a lawsuit against OpenAI, maker of ChatGPT, alleging infringement of the newspaper's copyrighted content used to train the chatbot.

In addition, training data that are not vetted could be biased, flawed, inaccurate or poisoned, and those defects could be propagated into the model. This could corrupt the model and lead to violations of data privacy and digital security rules. The problem will be further exacerbated as chatbots such as ChatGPT add digital memory features that allow the bots to remember and recall past information about their users.

As noted, there is the question of who bears responsibility for these problems and who is responsible for ensuring ethical use of the models. Specifically, does responsibility lie with the model's creators, the application developers, the enterprises deploying them and/or the end users? In other words, who is accountable for the legal ramifications around privacy and intellectual property rights?

The spread of apps based on chatbots will force society to rethink ethical norms that have been settled for centuries and to define how to behave in this new world of chatbots.

For example, the impact on the field of education has already started. Should we be concerned about plagiarism when professors now have to police whether a student's paper, or parts of it, was written by the individual or produced with the assistance of generative AI (GAI), distracting the instructor from the task of teaching? Another example is in marketing, where a GAI-assisted tool may develop materials that infringe on copyrighted images and content belonging to intellectual property rights owners.

The key concern is the “dark side” of AI, which has the broader potential to hinder society's acceptance of AI as a force for good, thereby limiting the technology's potential to flourish. There is a balancing act between taking advantage of the benefits of GAI and weighing the risks of using it, as well as the risks of not using it.

At this time, there may be no hermetic solution for preventing AI chatbots from going rogue. Consequently, society will have to find ways to coexist with the downsides while benefiting from the upsides, and continue searching for better solutions. The Luddites may return, but today in as-yet-unknown ways.
