UK’s Cybersecurity Agency Raises Concerns Over Chatbot Vulnerabilities

cybersecurity

The UK’s National Cyber Security Centre (NCSC) has issued a warning highlighting the potential manipulation of chatbots by hackers, which could result in alarming real-world consequences. The agency has identified the escalating cybersecurity risks associated with “prompt injection” attacks, where users manipulate prompts to induce unintended behavior in the underlying language models powering chatbots.

Chatbots, relying on artificial intelligence, respond to user-provided prompts by simulating human-like conversations. They are commonly used in various applications, such as online banking and shopping, primarily handling straightforward requests.

Prominent among these language models are Large Language Models (LLMs) like OpenAI’s ChatGPT and Google’s AI chatbot Bard. These LLMs are trained using datasets that generate human-like responses based on user input.

As chatbots are often used to pass data to external applications and services, the NCSC has expressed concerns that malicious prompt injections could pose a growing risk.

For example, users could input statements or questions unfamiliar to the language model or utilize specific combinations of words to override the model’s intended behavior. This manipulation might lead to unintended outcomes, such as generating offensive content or disclosing confidential information, particularly in systems that accept unchecked input.

An instance from this year demonstrated a security concern with Microsoft’s Bing Chat, where a student was able to unveil the chatbot’s hidden prompt instructions. Similarly, another researcher discovered vulnerabilities in ChatGPT through a third-party prompt injection, potentially allowing unauthorized access to YouTube transcripts.

The NCSC emphasizes that prompt injection attacks might result in tangible real-world consequences if systems aren’t designed with security in mind. The susceptibility of chatbots to manipulation and prompt misuse raises the risk of attacks, scams, and data breaches.

The agency advises proactive measures to address this issue: “Prompt injection and data poisoning attacks can be extremely difficult to detect and mitigate. However, no model exists in isolation, so what we can do is design the whole system with security in mind.”

The NCSC proposes integrating a rules-based system alongside machine learning models to prevent damaging actions, even if prompted to perform them. In essence, the agency underscores the importance of building secure systems that counteract the inherent vulnerabilities of machine learning algorithms, ultimately mitigating the potential cyber threats originating from AI and machine learning.