08 May 2016, 22:22

X-Forwarded-For: You keep using that word...

I recently had a discussion around the X-Forwarded-For header and common usage. This wasn’t the first time I’ve had the discussion, and probably won’t be the last. I’m going to jot down some thoughts for future me and future others.

For the record, this is the perspective from running online services - so the focus is on the incoming requests.

tl;dr: X-Forwarded-For is not as standard as people believe, and even where it is standard, it’s not the standard that people think it is. And don’t use it for server-to-server calls - just for proxies.

Issues

I have to ask three questions every time I see it in use:

  1. Can I believe that the client (or middleman) was truthful about the value?
  2. Can I believe that nothing in the middle messed up the handling of the value?
  3. Is the service making the call in just a “proxy”?

For the first one, let’s go back to the first rule of internet fight club - don’t trust anything from the client. This can easily be spoofed, so I have to remember to strip it at my front door.

As for the second question, let’s just say that the chances are not insignificant that something isn’t handling it correctly. The current RFCs for HTTP 1.1 (2616 and the new proposal 7230) both allow for multiple headers as long as the header value is a list:

A sender MUST NOT generate multiple header fields with the same field
name in a message unless either the entire field value for that
header field is defined as a comma-separated list...

https://tools.ietf.org/html/rfc7230

Most make the assumption that it’s on one line, and it’s comma separated (and I even had a case where some assumed it was a single value). The truth of the matter is that there are plenty of bugs in well know projects which don’t handle this correctly. There are more bugs in internal code.

This second question might have been handled by rfc7239, but it seems to contradict rfc7230. On the one hand, it’s header format is no longer just a list; on the other hand, rfc7239 explicitly permits (CAN) multiple headers. So, my vote is still out here.

So, regarding the first two questions, at all of your inspection points (including the implicit ones which are easily overlooked), you have to make sure the XFF is being handled correctly. If not, it’s worse than useless; it’s dangerous.

The philosophical question…

The last question is a bit harder to noodle through. What does it mean to be “forwarded”? The context here is a lot of proxying of requests where it is expected that the proxy isn’t making a meaningful change to the request itself (adding some tracking headers, converting from HTTPS to HTTP, caching, etc).

This is different than the case where one service is calling another service. The source service isn’t really proxying the request as it’s making a new request on behalf of the client. Even in the newer RFCs, I haven’t found a clear definition of “proxy” so I don’t think there’s a formal answer.

This may be a subtle distinction, but the meaning has consequences for how you manage it. When doing controls, there usually can only be one source: one value that gets used to compare.

As a simple example, I only want requests from a specific geography to come in, and I’m servicing both clients and other services. I have to decide if I want that geo restriction to apply to the original client or the geo of the last connection, which could be a service. If I choose last connection, then I’m going to be shutting down a lot of clients in that geography because they are dependent on a service in another geography. If I choose clients, in addition to making sure the client info gets to me, I have to make sure that the services are good handling that split good/bad responses.

In the case of rate controls, I have to make the same decision, and that’s got its own issues that I probably want both - one rate control for end clients, and another loose rate control for partner services. Are you supposed to parse the XFF chain and figure out which came from which? Can you even apply some

This leads me to the mindset that XFF should only be used to show a source connection via proxy, and something else should be used for requests made as part of a services chain: “X-Requested-For” or something similar.

19 Mar 2013, 00:20

Firewall Rules from Models

I’m trying to put a little more meat to the bones of my ANCL discussion. Unfortunately, I can’t say that I have a tool for it all, yet, but there’s a bit more of the theoretical basis for it. Some of the key requirements forming around ANCL are:

  • It abstracts the components involved - roles are used instead of source IPs and destionation IPs.

  • It generalizes the communication patterns - models are built using the roles; these models can be used in different locations by identifying what composes the roles.

  • It can compose multiple models together - since the models are composed of roles, using the same role in multiple models connects them together (caveat in the second side note).

I was reviewing some slidedecks and came across Ript which has a good presentation. It is a powerful abstraction over iptables. Bonus: It’s abstraction can even be carried to other stateful packet filters, software or hardware.

Comparing it to ANCL, there’s several key differences:

  • A partition is not a connection of models, but a combination of them. It’s an administrative domain which lumps multiple unrelated connections together. This is by far the biggest difference and is a matter of the mental models.

  • The labels aren’t the same as roles. They’re good for identifying, but they aren’t then instantiated. There’s a single mapping, when multiple would be useful. This seems like an implementation detail that could be easily addressed.

  • It’s focus is on specific instantiations of the models. While the rules would be portable to different devices, this limits its ability to be ported to separate but similar situations. This seems like it is a matter of changing conventions (e.g. the naming of the labels), but might be more.

The first item is what it really comes down to is that these are two different mental and description models for the problem. Ript is a firewall abstraction language, and not necessarily an application communication description language. That being said, there are a great many other pieces of Ript, not the least of it being something concrete, that make it very useful and means I have some work here to move ANCL to something that can compare.


There’s also two parts that Ript has also incorporated which I’ll be honest that I don’t know how to incorporate into ANCL:

  • NATs and SNATs and other translations. In many ways, these are not critical to the application communication pattern, at least, not critical at an abstract level. It’s when it’s applied in specific contexts where translations come into play.

  • “Bad Guy” Rejects. Like transactions, these aren’t critical to application communication patterns - in fact, these are anti-communication patterns. Necessary at times, but not something that are accounted for when building the patterns.


The side note from above is that the roles aren’t completely what are connected to each other. Consider two models:

  1. A three tier application model which has a “webserver” role, an “application” role, and a “DB” role.

  2. A DB model which has an “application” role and a “DB” role.

As the application owner of the first model, I would elaborate the nodes in the “application” role. Most likely, the “DB” would be specified by the DBA. Conversely, as the DB owner, I’d elaborate the nodes in the “DB” role, and leave the “application” role to someone else. In either case, half of the equation is left unfilled. There’s (at least) two ways to approach this:

  1. Since the key is to connect the two edges for the “application”-“DB” together, the real role association happens there. So, what language should be used to describe these “edge roles”?

  2. The ambiguity can be taken away if we insert “connection points” or “service” points into each model. In this case, in the application model, we replace the “DB” role with a “DB” connection point, and in the DB model, we replace the “application” role with a “DB” connection point. When joining these models together, we overlay the connection points.

Not sure which is the right way, so that’ll probably comes down to implementation.