Art of hacking LLM apps

Published in

InfoSec Write-ups

5 min readJul 3, 2023

I am Harish SG, a security researcher who studies Masters in Cybersecurity at UT Dallas,previously hunted on the Microsoft Bug Bounty Program and Google VRP.

I am sharing this article for security awareness and educational purposes only.

In this article! I am explaining about important vulnerabilities and attack surface in LLM applications.

Solve hidden LLM Hacking Challenges(Read this Article to bottom and send me flag and Solution ss of output in console

Solve LLM Prompt injection challenge (Mode 3) and send ss of output in console linkedin or Twitter via DM for more info checkout https://github.com/harishsg993010/DamnVulnerableLLMProject/tree/main

What is LLM?

LLM stands for “Large Language Model.” It refers to a type of artificial intelligence model designed to process and generate human-like language. Large Language Models are trained on massive amounts of text data and use advanced machine learning techniques to understand and generate natural language.

Models like OpenAI’s GPT (Generative Pre-trained Transformer) fall under the category of Large Language Models. These models learn patterns and relationships in language by training on diverse datasets, such as books, articles, websites, and other textual sources.

LLMs are capable of performing a variety of language-related tasks, including text completion, summarization, translation, sentiment analysis, question-answering, and more. They can generate coherent and contextually relevant responses based on the given input and are often used in applications involving natural language processing and understanding.

I will be focusing lot on following LLM vulnerabilities such as Prompt Injection, Sensitive data disclosure, SSRF , Unauthorised code injection , Lack of access control on LLM APIs.

What is prompt injection?

Prompt injection refers to the technique of providing specific instructions or context within the input or prompt given to a Large Language Model (LLM) to influence its output. It involves modifying the initial text provided to the model in order to steer it towards generating desired responses.

By carefully crafting the prompt, users can guide the LLM to produce outputs that align with their intentions. Prompt injection can involve various techniques

Examples of prompt injection attack on Damn Vulnerable LLM Application

In the above screenshot, I tricked the Damn Vulnerable LLM app to say the credit card details. Even though it does not show a valid credit card number, it suggested user an illegal way to generate a valid credit card number.

In the above screenshot, I tricked the Damn Vulnerable LLM app to say the valid Windows 11 activation keys

What is Sensitive Data Disclosure on LLM via Prompt injection?

Sensitive Data Disclosure via Prompt Injection refers to the potential risk of inadvertently revealing confidential or sensitive information by including such data within the input or prompt given to a Large Language Model (LLM).

Prompt injection involves providing specific instructions, context, or examples to the LLM to guide its output generation. However, if sensitive data is included within the prompt without proper precautions, there is a risk that the LLM may incorporate or disclose that information in its generated responses.

For example, if a user includes personal identifiable information (PII) such as names, addresses, phone numbers, or financial details within the prompt, there is a chance that the LLM may generate responses that include or expose that sensitive data. This could potentially lead to a breach of privacy, data leakage, or other security concerns.

Examples of Sensitive Data Disclosure on LLM via Prompt injection

In the above screenshot I tricked Damn Vulnerable LLM app to share AWS key and SSN number of some random person

Challenge : Find the hidden password and kubernetes creds via prompt injection and DM me that via Twitter or linkedin to verify along with screenshot

Note: All the of them are dummy number for demonstration purpose.

Unauthorised code injection on LLM

Unauthorized code injection on a Large Language Model (LLM) refers to the act of injecting malicious or unauthorized code into the input or prompt provided to the LLM, with the intent of executing arbitrary code or compromising the security of the system.

LLMs are powerful language models that can generate text based on the input they receive. However, if input validation and sanitization are not properly implemented, an attacker may attempt to inject code that the LLM interprets and executes, leading to security vulnerabilities.

Example for Unauthorised code injection on LLM

In the above screenshot, I tried triggered an hidden code injection vulnerability via word “command” in damn vulnerable LLM app which ran ls and whoami commands in commad line in this example!

Challenge : There is another code injection in damn vulnerable LLM app and find it if you can

Lack of access control on LLM APIs

Lack of access control on LLM APIs refers to a security vulnerability where there is inadequate or insufficient control over who can access and use the application programming interfaces (APIs) provided by Large Language Models (LLMs). It means that there are no proper mechanisms in place to authenticate or authorize users or applications that interact with the LLM API.

Example for Lack of access control on LLM APIs

In this above screenshot! Developer of Damn Vulnerable LLM hardcoded API secret and there is no authentication for muliple Users for API and there is no access control for the API. This is simple example for demonstration.

Mitigation against these vulnerabilities in LLM

don’t share senstive data such as password , SSN , private key etc on training data and while using this application.
sanitize input to defend against injection vulnerabilities.

I developed an project called Damn Vulnerable LLM App which is vulnerable by design. you can access project from following link and you need to have Open AI API key and add it to secrets.

Link to project repository:https://replit.com/@hxs220034/DamnVulnerableLLMApplication-Demo

I recommend you guys to fork and play with code.

Vulnerable LLM Application is trained against data fom bitcoin white paper , ethereum white paper , dummy keys and passwords, wikipedia data on dogs and cat.

Credits : This code is based on LLM App developed by Zahid Khawaja

Follow me on twitter : https://twitter.com/CoderHarish

Follow me on linkedlin : https://www.linkedin.com/in/harish-santhanalakshmi-ganesan-31ba96171/

website : harishsg.com