I recently had a discussion around the X-Forwarded-For header and common usage. This wasn’t the first time I’ve had the discussion, and probably won’t be the last. I’m going to jot down some thoughts for future me and future others.
For the record, this is the perspective from running online services - so the focus is on the incoming requests.
tl;dr: X-Forwarded-For is not as standard as people believe, and even where it is standard, it’s not the standard that people think it is. And don’t use it for server-to-server calls - just for proxies.
I have to ask three questions every time I see it in use:
- Can I believe that the client (or middleman) was truthful about the value?
- Can I believe that nothing in the middle messed up the handling of the value?
- Is the service making the call just a “proxy”?
For the first one, let’s go back to the first rule of internet fight club - don’t trust anything from the client. The header can easily be spoofed, so I have to remember to strip it at my front door.
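To make “strip it at my front door” concrete, here is a minimal sketch in Python. It assumes the edge component sees headers as a list of (name, value) pairs and knows the address of the socket peer; the function name `sanitize_at_edge` is made up for illustration and is not any particular proxy’s API.

```python
def sanitize_at_edge(headers, peer_ip):
    """Drop every client-supplied X-Forwarded-For and record only what we saw.

    headers: list of (name, value) tuples, since a name may repeat.
    peer_ip: address of the TCP connection, the only value the edge can vouch for.
    """
    # Header names are case-insensitive, so compare in lowercase.
    cleaned = [(name, value) for name, value in headers
               if name.lower() != "x-forwarded-for"]
    # Start a fresh chain that contains only the address we observed ourselves.
    cleaned.append(("X-Forwarded-For", peer_ip))
    return cleaned
```

Anything the client wrote into the header is gone after this step, so everything behind the front door only ever sees values that my own infrastructure added.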
As for the second question, let’s just say that the chances are not insignificant that something isn’t handling it correctly. The RFCs for HTTP 1.1 (2616 and its replacement, 7230) both allow for multiple headers with the same name as long as the header value is defined as a list:
A sender MUST NOT generate multiple header fields with the same field name in a message unless either the entire field value for that header field is defined as a comma-separated list... https://tools.ietf.org/html/rfc7230
Most implementations assume that it’s on one line and that it’s comma separated (and I even had a case where some assumed it was a single value). The truth of the matter is that there are plenty of bugs in well-known projects which don’t handle this correctly. There are more bugs in internal code.
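As a rough illustration of handling both shapes (several header lines, and several comma-separated values per line), here is a small Python sketch. The function name `xff_addresses` and the (name, value) list representation are assumptions made for the example.

```python
def xff_addresses(headers):
    """Collect every X-Forwarded-For value, in order, across all header lines.

    headers: list of (name, value) tuples in the order they were received.
    """
    addresses = []
    for name, value in headers:
        if name.lower() != "x-forwarded-for":
            continue
        # Each line may itself be a comma-separated list.
        for part in value.split(","):
            part = part.strip()
            if part:  # skip empty items such as "a,,b" or a trailing comma
                addresses.append(part)
    return addresses
```

Code that reads only the first header line, or treats the whole value as a single address, will disagree with this about who the client is, and that disagreement is exactly the kind of bug described above.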
This second question might have been handled by rfc7239 (which defines a standardized Forwarded header), but it seems to contradict rfc7230. On the one hand, its header format is no longer just a list; on the other hand, rfc7239 explicitly permits multiple headers. So, the jury is still out for me here.
So, regarding the first two questions, at all of your inspection points (including the implicit ones which are easily overlooked), you have to make sure the XFF is being handled correctly. If not, it’s worse than useless; it’s dangerous.
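One common way to decide which entry to believe is to walk the chain from the connection backwards and stop at the first address that isn’t one of your own proxies. The Python sketch below assumes the chain has already been parsed into a list and that the trusted proxy range is known; `TRUSTED_PROXIES` and the function names are placeholders, not a recommendation for any specific network layout.

```python
import ipaddress

# Hypothetical range for proxies I operate myself.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]

def is_trusted(addr):
    """True only if addr parses as an IP address inside a trusted range."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False  # garbage or "unknown" is never trusted
    return any(ip in net for net in TRUSTED_PROXIES)

def client_address(peer_ip, xff_chain):
    """Return the nearest address that was not reported by a trusted proxy.

    peer_ip: address of the actual connection.
    xff_chain: parsed X-Forwarded-For values, oldest first.
    """
    candidate = peer_ip
    for addr in reversed(xff_chain):
        # An entry is only believable if the hop that added it is trusted.
        if not is_trusted(candidate):
            break
        candidate = addr
    return candidate
```

With a connection from `10.0.0.5` and a chain of `6.6.6.6, 198.51.100.9, 10.0.0.7`, this returns `198.51.100.9`; the leftmost value is ignored because nothing I trust vouches for it. The returned value can still be an unparseable string if a trusted proxy passed one along, so callers should validate it before use.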
The philosophical question…
The last question is a bit harder to noodle through. What does it mean to be “forwarded”? The context here is a lot of proxying of requests where it is expected that the proxy isn’t making a meaningful change to the request itself (adding some tracking headers, converting from HTTPS to HTTP, caching, etc).
This is different than the case where one service is calling another service. The source service isn’t really proxying the request as it’s making a new request on behalf of the client. Even in the newer RFCs, I haven’t found a clear definition of “proxy” so I don’t think there’s a formal answer.
This may be a subtle distinction, but the meaning has consequences for how you manage it. When applying controls, there can usually be only one source: one value that gets compared.
As a simple example, I only want requests from a specific geography to come in, and I’m servicing both clients and other services. I have to decide if I want that geo restriction to apply to the original client or to the geo of the last connection, which could be a service. If I choose last connection, then I’m going to be shutting down a lot of clients in that geography because they are dependent on a service in another geography. If I choose clients, in addition to making sure the client info gets to me, I have to make sure that the services are good at handling that split of good/bad responses.
In the case of rate controls, I have to make the same decision, and that has its own issue: I probably want both - one rate control for end clients, and another, looser rate control for partner services. Are you supposed to parse the XFF chain and figure out which came from which? Can you even apply some
This leads me to the mindset that XFF should only be used to show a source connection via proxy, and something else should be used for requests made as part of a services chain: “X-Requested-For” or something similar.
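To show what that split could look like, here is a small Python sketch that keeps the two sources apart. Everything in it is hypothetical: “X-Requested-For” is just the placeholder name suggested above, not an existing standard, and the sketch assumes the header is only honored from callers already authenticated as partner services.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Sources:
    proxied_from: str             # who connected, as established via XFF
    requested_for: Optional[str]  # end client a calling service acts for

def read_sources(headers, proxied_from):
    """Pair the proxied source with an optional X-Requested-For value."""
    requested_for = None
    for name, value in headers:
        if name.lower() == "x-requested-for":  # hypothetical header name
            requested_for = value.strip() or None
    return Sources(proxied_from, requested_for)

def geo_subject(sources):
    """Geo restrictions apply to the end client when one is declared."""
    return sources.requested_for or sources.proxied_from

def rate_limit_keys(sources):
    """One bucket for a direct client; two when a service calls on its behalf."""
    if sources.requested_for is None:
        return [("client", sources.proxied_from)]
    # Looser limit for the partner service, normal limit for the end client.
    return [("partner", sources.proxied_from),
            ("client", sources.requested_for)]
```

With the two values kept separate, nothing has to guess which entries in a single chain were proxies and which were services.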