InfoSec Write-ups

A collection of write-ups from the best hackers in the world on topics ranging from bug bounties and CTFs to vulnhub machines, hardware challenges and real life encounters. Subscribe to our weekly newsletter for the coolest infosec updates: https://weekly.infosecwriteups.com/

Follow publication

Hacking Chat GPT and infecting a conversation history

Tom Philippe
InfoSec Write-ups
Published in
6 min readJul 1, 2023

--

Large Language Models such as ChatGPT, Bing AI, and Jasper have taken over the news due to their capabilities in assisting with various tasks, from writing emails to coding assistance. However, amidst the excitement, it is essential to consider the protection of the data that we share with these tools.

Data breaches, such as the recent OpenAI leaks, have raised concerns about the security of LLMs. While such breaches have been happening for decades, a new issue has emerged with these models: prompt injection attacks.

In this article, we delve into the details of prompt injection attacks, and demonstrate how they can affect the security of our data by infecting the our conversations with these tools.

Disclaimer: The information and materials provided in this resource are intended solely for educational and informational purposes. The purpose of sharing this knowledge is to promote learning, awareness, and understanding of various technologies and systems. This resource is not intended to support, endorse, promote, or facilitate any form of hacking or unauthorized access to computer systems or networks.

What is prompt injection?

Large Language Models use text inputs from the user, process the inputs, and then reply with a text output. This is what allows the “conversation” to take place between the user and the tool.

Let’s take an example. As part of my daily task, I would ask ChatGPT to summarize an article.

Article to summarize

Let’s now imagine that this article was somehow altered by an attacker, to include rogue instructions for ChatGPT.

The original command was modified by the content of the article

As we see, instructions located inside the article have influenced ChatGPT’s behaviour, here by simply modifying the output text.

Why is it a problem?

--

--

Published in InfoSec Write-ups

A collection of write-ups from the best hackers in the world on topics ranging from bug bounties and CTFs to vulnhub machines, hardware challenges and real life encounters. Subscribe to our weekly newsletter for the coolest infosec updates: https://weekly.infosecwriteups.com/

Written by Tom Philippe

Security researcher, Hacking passionate, Head of Offensive Security at Responsible Cyber

Responses (2)

Write a response