20 Feb 2017, 12:37

ANCL: Connecting Multiple Models

One use case not previously documented is connecting two models together. This can cover some of the combinations of the earlier use cases: the same model applied to a node in different ways, a node in multiple models, and self-referential models. In our previous models, we either fully connected everything or kept the models orthogonal (“node in multiple models”). The key difference here is that none of the previous cases fundamentally change the model world - they are a matter of making sure that a single model at a time is applied to the nodes.

This is a matter of connecting two models in a way that could be seen as making a new unified model.

Note: below, I’m going to be using two new notations: a. There’s an added yaml dictionary level for the model name. b. Some of the model details (specifically: target ingresses) may be purposefully left out.

The simple example of an app talking to Cassandra:

appmodel:
  client:
    egress: [[app,appapi]]
    ingress: {}
  app:
    egress: [[db,binary]]
    ingress:
      appapi: [8009,8009,"tcp"]
  db:
    egress: []
    ingress:
      binary: [9042,9042,"tcp"]
cassandra:
  client:
    egress: [[local-server,binary]]
    ingress: {}
  local-server:
    egress:
    - [server,plain-gossip]
    - [remote-server,encrypted-gossip]
    ingress:
      binary: [9042,9042,"tcp"]
      plain-gossip: [7000,7000,"tcp"]
      encrypted-gossip: [7001,7001,"tcp"]
  remote-server:
    egress:
    - [local-server,encrypted-gossip]
    ingress:
      encrypted-gossip: [7001,7001,"tcp"]

(Figure: appmodel and cassandra)

One way to handle this is to not. Basically, avoid some of the overlaps here - e.g. lose the db role inside of the appmodel. Make it look like:

(Figure: no overlap)

Then you could say for role assignments:

appnode01: [appmodel::app,cassandra::client]
cassandranode01: [cassandra::local-server]

While this works, it doesn’t seem like a good idea. Without the additional role assignment context, you don’t know that the app node requires DB access. The model really is incomplete.

Another approach is to be able to link the two models - to say that nodes in one model are equivalent to nodes in another model.

(Figure: linked)

There’s a couple of ways to do this.

  1. Connect the edge: Recognize that the edge appmodel::app----binary---->appmodel::db::binary is the same as the edge cassandra::client----binary---->cassandra::local-server::binary.
  2. Connect the node pair: Similar, but just say appmodel::app == cassandra::client and appmodel::db == cassandra::local-server.

The first might be a more accurate assessment of what it actually is, but it is more verbose and I’m not sure it achieves any real difference. So, going with the second, a linkage might look like:

[appmodel,cassandra]:
- [app,client]
- [db,local-server]

This doesn’t look like good YAML. I’m not sure every parser can handle it, so this might change. The key is to be able to discern the tuple.
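
If the list-as-key form turns out to be a problem, one possible fallback is a plain string key that is split back into the tuple. This is only a sketch: the `"appmodel,cassandra"` key spelling and the `parse_linkages` helper are assumptions for illustration, not part of the notation above.

```python
def parse_linkages(raw):
    """Turn {"modelA,modelB": [[roleA, roleB], ...]} into tuple-keyed linkages.

    The comma-joined string key is a hypothetical, parser-friendly stand-in
    for the [modelA,modelB] key used in the text.
    """
    linkages = {}
    for key, pairs in raw.items():
        models = tuple(part.strip() for part in key.split(","))
        if len(models) != 2 or not all(models):
            raise ValueError(f"expected two model names, got {key!r}")
        linkages[models] = [tuple(pair) for pair in pairs]
    return linkages


raw = {"appmodel,cassandra": [["app", "client"], ["db", "local-server"]]}
linkages = parse_linkages(raw)
print(linkages)
```

Either spelling works as long as the model tuple and the role pairs can be recovered unambiguously.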

Implementation

The interesting part is that this can be implemented without fundamentally changing anything from before. The only piece to add is automatic role expansion - wherever a node has a role that appears in a linkage, add the linked equivalent role to the node’s roles.

appnode01: [appmodel::app]
cassandra01: [cassandra::local-server]

becomes

appnode01: [appmodel::app,cassandra::client]
cassandra01: [cassandra::local-server,appmodel::db]
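
The expansion above can be sketched in a few lines of Python. The `expand_roles` name and the tuple-keyed `linkages` dictionary are assumptions made for this example; the sketch does a single pass and does not chase linkages transitively.

```python
def expand_roles(assignments, linkages):
    """Add the linked-equivalent role for every role a node already has."""
    # Build a symmetric lookup, e.g. "appmodel::app" <-> "cassandra::client".
    equivalents = {}
    for (left_model, right_model), pairs in linkages.items():
        for left_role, right_role in pairs:
            left = f"{left_model}::{left_role}"
            right = f"{right_model}::{right_role}"
            equivalents.setdefault(left, []).append(right)
            equivalents.setdefault(right, []).append(left)

    expanded = {}
    for node, roles in assignments.items():
        result = list(roles)
        for role in roles:
            for extra in equivalents.get(role, []):
                if extra not in result:
                    result.append(extra)
        expanded[node] = result
    return expanded


linkages = {("appmodel", "cassandra"): [("app", "client"), ("db", "local-server")]}
assignments = {
    "appnode01": ["appmodel::app"],
    "cassandra01": ["cassandra::local-server"],
}
expanded = expand_roles(assignments, linkages)
print(expanded)
```

Nodes whose roles do not appear in any linkage pass through unchanged.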

Inferred model roles

Above, the db::binary and local-server::binary are redundant: the port information is defined twice. In reality, the appmodel is more concerned about the app itself and less concerned about the specifics of the DB; it just cares that there is one. In the case of knowing that there will be linkage, you could consider stubbing out the role that is going to be the crux of the linkage. This might look like:

appmodel:
  db:
    egress: []
    ingress:
      binary: false

From here, it says “there’s a db role, but it needs a linkage to make it concrete.”

Given that the stub isn’t really saying much that isn’t already implied by the egress on the “app” role, it’s possible to not even define the db role - not even having a “db” dictionary key - and to infer that there’s a “db” role. Similarly, since the “client” role only has an egress which is referenced elsewhere, it’s possible to infer that as well.
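
A rough sketch of the first kind of inference (a target role that is named in an egress but never defined) might look like the following. The `inferred_roles` helper is hypothetical, and it only covers the undefined-target case, not inferring an egress-only role like “client”.

```python
def inferred_roles(model):
    """Return roles that egress entries point at but the model never defines."""
    defined = set(model)
    referenced = {
        target
        for role in model.values()
        for target, _service in role.get("egress", [])
    }
    return sorted(referenced - defined)


# The appmodel from above with the "db" key left out entirely.
appmodel = {
    "client": {"egress": [["app", "appapi"]], "ingress": {}},
    "app": {
        "egress": [["db", "binary"]],
        "ingress": {"appapi": [8009, 8009, "tcp"]},
    },
}
missing = inferred_roles(appmodel)
print(missing)
```

Each role returned this way is one that needs a linkage to become concrete.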

From an automatic aspect, this means that the role expansion could be limited. However, since both sides are likely to have something stubbed out or inferred, it could be easy for this to fall into the trap of not expanding either side and we’re back to step one. So, one side has to remain. Since the realizations are going to happen from the target side, I think it makes sense to keep the source side roles. This means the above roles would look like:

appnode01: [appmodel::app,cassandra::client]
cassandranode01: [cassandra::local-server]

The added bonus of the inferred roles is that they can be automatically replaced when processing. This might make realizing the configuration output a little bit easier. E.g. don’t end up with many incoming policies for the cassandra::local-server connection; end up with one (since all of the inferred target roles are not expanded) with multiple sources (those roles are expanded) for each linked context/model.
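
To make the “one policy, multiple sources” idea concrete, here is a small sketch. It assumes the edges have already been resolved into (source role, target role, service) triples with inferred target roles replaced by their linked concrete role; the `group_by_target` helper and the triple shape are hypothetical.

```python
def group_by_target(edges):
    """Collect all source roles under a single (target role, service) policy."""
    policies = {}
    for source, target, service in edges:
        sources = policies.setdefault((target, service), [])
        if source not in sources:
            sources.append(source)
    return policies


edges = [
    # appmodel::db was inferred, so the edge already points at the linked role.
    ("appmodel::app", "cassandra::local-server", "binary"),
    ("cassandra::client", "cassandra::local-server", "binary"),
    ("cassandra::remote-server", "cassandra::local-server", "encrypted-gossip"),
]
policies = group_by_target(edges)
print(policies)
```

Here the binary port on cassandra::local-server ends up with one incoming policy that lists both source roles.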

Self-referential update

This can be used to connect two cassandra data centers together (dc1,dc2). The linkage:

[dc1::cassandra,dc2::cassandra]:
- [local-server,remote-server]
- [remote-server,local-server]

This is basically saying that in the dc1::cassandra context, dc2::cassandra::local-server is a remote-server, and vice-versa. So node assignments go from:

cassandra-dc01: [dc1::cassandra::local-server]
cassandra-dc02: [dc2::cassandra::local-server]

to:

cassandra-dc01: [dc1::cassandra::local-server,dc2::cassandra::remote-server]
cassandra-dc02: [dc2::cassandra::local-server,dc1::cassandra::remote-server]

This seems to work since the linkage doesn’t create any redundant roles in this use case. That may fall down if there are any, but I’m not seeing how that would come up just yet.

Not done yet?

The downside of this is that since no new actual model is defined, contexts aren’t automatically done. So, the above works for what it is, but the linkage can’t necessarily be promoted. It can be, but that might be overdoing what should be done. This has some impact on the linkage of models vs linkage of context+models.

So, I have to go model a bunch of stuff at work and see if this can express in a simple way everything that I need there.

18 Feb 2017, 15:03

ANCL: Use Cases

I had a chance recently to revisit ANCL in two ways. I recently had to compare some firewall rules between two different firewalls that were set up to mirror each other. The original firewall was not set up using any modeling of firewall rules, so it very much fell under the issues that I originally commented on. Lessons learned:

  1. It’s much easier to collect/reason about the communication pattern when you look at it from the “what do I need to talk to” perspective instead of the “what talks to me” perspective. People are blocked when their downstreams aren’t working and so have a bit more motivation to make sure those are well described.
  2. Even JSON is a bit more verbose than I wanted to deal with when working on the rules en masse.
  3. To make it happen, I simplified and didn’t attempt to build out any kind of hierarchy or dependency. Even though some would stem from the same model, I manually created the instantiation of those models. I don’t have a good approach for this yet, but working through the concrete example gave me a better understanding.
  4. It’s easy to use IP addresses in the context of firewalls, and you can overlap anything that has that address or a containing CIDR.
  5. Naming is hard (1): There seems to be a bit of redundancy with model roles. If I have a service, what do I specify for clients and what do I specify for the port which the clients connect to? In both cases, I want to use “client”.
  6. Naming is hard (2): It’s still not clear to me what to use to describe the generic descriptions (e.g. models), the components in those descriptions (e.g. roles), and the instances of items in those roles (e.g. nodes?). I keep using the term roles in the place of the nodes - I think.

Separately from the firewall, I’ve been looking at using this to help figure out overall communication matters. I’m trying to bridge together different applications run by different groups and using different interconnect mechanisms. I need to get quality and slot information for what’s talking to what. That’s got me thinking. A few more items to postulate:

  1. Mental exercise: How does routing information play into all of this? Does different routing affect how the models are structured?
  2. Mental exercise: Can I use the models to influence aggregation and reporting on netflow data? Each netflow entry could be associated with a specific model, which gives a lot more context than the protocol, port, and subnet that make up the bulk of what I usually see.
  3. Mental exercise: What’s it look like to add additional information to each model? Not just “443/tcp” but also “100Mbps”, “TLS”, and “client certificate required”?
  4. Mental exercise: In the first model, I associated roles to specific IPs. What’s it look like when instead of IPs, I use AWS instance IDs, or AWS security group IDs, or container IDs, processes, etc?

So, there’s a lot more interesting stuff beyond just the firewall, and it’ll be interesting to see what comes up. But I still worry about the complexity, so I want to figure out ways to reduce that complexity.

The first one is to not have specific models (“this application’s Oracle DB”) for everything and to be able to use more generic models (“Oracle DB communication”). This means having an ability to reference a model. I’m still not sure how to do that. So, I’m trying to take a step back and come up with some use cases to help noodle through this. So with that in mind, the remainder of this is about examining that. I’m not committing to anything so you’ll see possibly a few implementations below.

Use Cases

Simple 3 Tier Application

This is your classic three tier application.

Client->Web->App->DB

A sample general model could look like:

client:
  egress:
  - [web,webapi]
  ingress: {}
web:
  egress:
  - [app,appapi]
  ingress:
    webapi: [443,443,"tcp"]
app:
  egress:
  - [db,sqlnet]
  ingress:
    appapi: [8009,8009,"tcp"]
db:
  egress: []
  ingress:
    sqlnet: [1521,1521,"tcp"]
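
As a sketch of how a model like this could be flattened into firewall-style rules, the snippet below follows each egress entry to the matching ingress definition. It assumes the ingress triple means [low port, high port, protocol], which the examples suggest but do not state; `flatten` is a hypothetical helper.

```python
def flatten(model):
    """Resolve every egress edge to (source, target, low, high, proto)."""
    rules = []
    for source, spec in model.items():
        for target, service in spec.get("egress", []):
            # Assumed layout of an ingress entry: [low port, high port, protocol].
            low, high, proto = model[target]["ingress"][service]
            rules.append((source, target, low, high, proto))
    return rules


three_tier = {
    "client": {"egress": [["web", "webapi"]], "ingress": {}},
    "web": {"egress": [["app", "appapi"]], "ingress": {"webapi": [443, 443, "tcp"]}},
    "app": {"egress": [["db", "sqlnet"]], "ingress": {"appapi": [8009, 8009, "tcp"]}},
    "db": {"egress": [], "ingress": {"sqlnet": [1521, 1521, "tcp"]}},
}
rules = flatten(three_tier)
for rule in rules:
    print(rule)
```

The rules are still expressed in terms of roles; mapping roles to addresses is a separate step.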

Shared DB

This is the case of the same model being applied in two different contexts with one overlapping resource. The example is a shared DB resource (here shared between dev and prod, but probably shared across multiple DB).

(Figure: prod/dev share db)

A fully expanded model could look like:

dev-app:
  egress:
  - [db,sqlnet]
  ingress: {}
prod-app:
  egress:
  - [db,sqlnet]
  ingress: {}
db:
  egress: []
  ingress:
    sqlnet: [1521,1521,"tcp"]

However, in reality, there’s a base model which looks like just:

app:
  egress:
  - [db,sqlnet]
  ingress: {}
db:
  egress: []
  ingress:
    sqlnet: [1521,1521,"tcp"]

The question is really about how to relate multiples together. Looking at roles:

prod-app: ["prod::app"]
dev-app: ["dev::app"]
db: ["prod::db","dev::db"]
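
One way to read these context-qualified roles is that an egress edge only connects nodes that share a context. The sketch below works under that assumption and expects every role string to be exactly `context::role`; `node_flows` is a hypothetical helper.

```python
def node_flows(model, node_roles):
    """Pair up nodes whose roles share a context and are joined by an egress edge."""
    # Index nodes by (context, role), e.g. ("prod", "db") -> ["db"].
    by_role = {}
    for node, roles in node_roles.items():
        for qualified in roles:
            context, role = qualified.split("::")
            by_role.setdefault((context, role), []).append(node)

    flows = set()
    for (context, role), sources in by_role.items():
        for target_role, service in model[role]["egress"]:
            low, high, proto = model[target_role]["ingress"][service]
            for src in sources:
                for dst in by_role.get((context, target_role), []):
                    flows.add((src, dst, low, high, proto))
    return sorted(flows)


base = {
    "app": {"egress": [["db", "sqlnet"]], "ingress": {}},
    "db": {"egress": [], "ingress": {"sqlnet": [1521, 1521, "tcp"]}},
}
node_roles = {
    "prod-app": ["prod::app"],
    "dev-app": ["dev::app"],
    "db": ["prod::db", "dev::db"],
}
flows = node_flows(base, node_roles)
print(flows)
```

Both apps reach the shared db node, and neither app gets a flow to the other.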

This works in this simple example, but I’m not sure it covers everything (see below).

Same model applied to node as two different roles

This is one that masks quite a bit so it’s not clear what the perfect setup is. The simple case is that there’s a DB that serves sqlnet, but in turn also connects to other DBs using sqlnet (e.g. replication).

main-db -> ro1-db -> ro2-db

This could look like:

db-client:
  egress:
  - [db-server,sqlnet]
  ingress: {}
db-server:
  egress: []
  ingress:
    sqlnet: [1521,1521,"tcp"]

main-db: [main2ro1::db-server]
ro1-db: [main2ro1::db-client,ro12ro2::db-server]
ro2-db: [ro12ro2::db-client]

The “db-server” and “db-client” part feels a bit weird. I kinda want to just have “server” and “client” but then feel like I need another name hierarchy - e.g. “db::client” and “db::server” - so the roles would look like:

main-db: [main2ro1::db::server]
ro1-db: [main2ro1::db::client,ro12ro2::db::server]
ro2-db: [ro12ro2::db::client]

This looks ok, but there are two concerns for me:

  1. How many things are “client” or “server” so would there be a way to simplify that?
  2. Having to have a context for all of the directed pairings seems a bit overdone. Is there a way to simplify that?

The latter concerns me more. Maybe not using the pairwise, and looking at the context to be a bit more on the node (in this case) itself:

main-db: [main::db::server]
ro1-db: [main::db::client, ro1::db::server]
ro2-db: [ro1::db::client]

Node as Multiple models

This is the case of having a node participate in multiple models. The example is that the node is part of its main role (app or db), but it’s also being monitored and logged into (so, “adminee” controlled by “adminbox”).

(insert app/db models above)
adminee:
  ingress:
    ssh: [22,22,"tcp"]
    snmp: [161,161,"udp"]
  egress: []
adminbox:
  ingress: {}
  egress:
    - [adminee,ssh]
    - [adminee,snmp]

With an example of multiple roles put together:

prod-app: ["prod::app","adminee"]

Overlapping attributes

This is more of “if there are overlapping attributes, a node needs to get the roles of every entry that matches.” Simple example of overlapping IP addresses/CIDRs:

"192.168.1.50/32": [adminbox]
"192.168.1.0/24": [adminee]

In this case, 192.168.1.50/32 has both [adminbox,adminee].
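
The overlap lookup can be sketched with Python’s standard `ipaddress` module. The `roles_for` helper is hypothetical; it simply collects the roles of every CIDR that contains the address.

```python
import ipaddress


def roles_for(address, assignments):
    """Collect the roles of every CIDR entry that contains the address."""
    ip = ipaddress.ip_address(address)
    roles = []
    for cidr, cidr_roles in assignments.items():
        if ip in ipaddress.ip_network(cidr):
            for role in cidr_roles:
                if role not in roles:
                    roles.append(role)
    return roles


assignments = {
    "192.168.1.50/32": ["adminbox"],
    "192.168.1.0/24": ["adminee"],
}
print(roles_for("192.168.1.50", assignments))
print(roles_for("192.168.1.51", assignments))
```

The first lookup picks up both roles, while the second address only matches the /24 entry.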

Self Referential

Some models are a bit self-referential. Nodes of the same role will talk to each other (cluster members). Nodes of the corresponding role (cluster members in different subsections of the cluster) will talk to each other in another way. The poster child for this is Cassandra:

(Figure: Cassandra Fun)

So, a model might look like:

client:
  egress:
  - [server,binary]
  ingress: {}
local-server:
  egress:
  - [server,plain-gossip]
  - [remote-server,encrypted-gossip]
  ingress:
    binary: [9042,9042,"tcp"]
    plain-gossip: [7000,7000,"tcp"]
    encrypted-gossip: [7001,7001,"tcp"]
remote-server:
  egress:
  - [local-server,encrypted-gossip]
  ingress:
    encrypted-gossip: [7001,7001,"tcp"]

And the roles might look like:

app-dc1: [dc1::cassandra::client]
app-dc2: [dc2::cassandra::client]
cass-dc1: [dc1::cassandra::local-server,dc2::cassandra::remote-server]
cass-dc2: [dc2::cassandra::local-server,dc1::cassandra::remote-server]

I’m actually surprised by this model. It seems to be one of the cleanest but it’s also pretty complex. Feels like a trap but I’m not seeing it yet.

Uh… distinct items?

I’m having trouble describing this one, and a bit of trouble reasoning about it.

The general idea is that there are cases where you need to have a general pattern, but replicated many times with specific contexts. The simple example would be to have 30 nodes - each of which has a self-referential pattern that only refers to itself. This is kinda like the Cassandra situation, with the subtle distinction that each Cassandra node talks to all other Cassandra nodes, whereas in this case each node would only talk to itself. Effectively each node is its own context for a role (for as ugly as that sounds) that follows the pattern.

There are two practical answers for this right now:

  1. Since it’s self-referential, it actually is unlikely to be needed to be defined (most people can talk to themselves and processes are probably listening on localhost anyways - which has overlapping IP space and thar be dragons with trying to reason down that one right now).
  2. You can enumerate each as a separate context - this seems like a workaround, but it at least allows for it, just not efficiently.

So, that may be enough of a starting point.

coda

I think that’s enough for now. Definitely something to help ponder through all of this…