Member-only story
Hacking Chat GPT and infecting a conversation history
Large Language Models such as ChatGPT, Bing AI, and Jasper have taken over the news due to their capabilities in assisting with various tasks, from writing emails to coding assistance. However, amidst the excitement, it is essential to consider the protection of the data that we share with these tools.
Data breaches, such as the recent OpenAI leaks, have raised concerns about the security of LLMs. While such breaches have been happening for decades, a new issue has emerged with these models: prompt injection attacks.
In this article, we delve into the details of prompt injection attacks, and demonstrate how they can affect the security of our data by infecting the our conversations with these tools.

Disclaimer: The information and materials provided in this resource are intended solely for educational and informational purposes. The purpose of sharing this knowledge is to promote learning, awareness, and understanding of various technologies and systems. This resource is not intended to support, endorse, promote, or facilitate any form of hacking or unauthorized access to computer systems or networks.
What is prompt injection?
Large Language Models use text inputs from the user, process the inputs, and then reply with a text output. This is what allows the “conversation” to take place between the user and the tool.
Let’s take an example. As part of my daily task, I would ask ChatGPT to summarize an article.

Let’s now imagine that this article was somehow altered by an attacker, to include rogue instructions for ChatGPT.

As we see, instructions located inside the article have influenced ChatGPT’s behaviour, here by simply modifying the output text.