I had a chance recently to revisit ANCL in two ways. First, I had to compare some firewall rules between two different firewalls that were set up to mirror each other. The original firewall was not set up using any modeling of firewall rules, so it very much fell under the issues that I originally commented on. Lessons learned:
- It’s much easier to collect/reason about the communication pattern when you look at it from the “what do I need to talk to” perspective instead of the “what talks to me” perspective. People are blocked when their downstreams aren’t working and so have a bit more motivation to make sure those are well described.
- Even JSON is a bit more verbose than I wanted to deal with when working on the rules en masse.
- To make it happen, I simplified and didn’t attempt to build out any kind of hierarchy or dependency. Even though some would stem from the same model, I manually created the instantiation of those models. I don’t have a good approach for this yet, but working through the concrete example gave me a better understanding.
- It’s easy to use IP addresses in the context of firewalls, and you can overlap anything that has that address or a containing CIDR.
- Naming is hard (1): There seems to be a bit of redundancy with model roles. If I have a service, what do I specify for clients and what do I specify for the port which the clients connect to? In both cases, I want to use “client”.
- Naming is hard (2): It’s still not clear to me what to use to describe the generic descriptions (e.g. models), the components in those descriptions (e.g. roles), and the instances of items in those roles (e.g. nodes?). I keep using the term roles in the place of the nodes - I think.
Separately from the firewall, I’ve been looking at using this to help figure out overall communication matters. I’m trying to bridge together different applications run by different groups and using different interconnect mechanisms. I need to get quality and slot information for what’s talking to what. That’s got me thinking. A few more items to postulate:
- Mental exercise: How does routing information play into all of this? Does different routing affect how the models are structured?
- Mental exercise: Can I use the models to influence aggregation and reporting on netflow data? Each netflow entry could be associated with a specific model, which gives a lot more context than the protocol, port, and subnet that end up being the bulk of what I usually see.
- Mental exercise: What does it look like to add additional information to each model? Not just “443/tcp” but also “100Mbps”, “TLS”, and “client certificate required”?
- Mental exercise: In the first model, I associated roles to specific IPs. What does it look like when instead of IPs, I use AWS instance IDs, or AWS security group IDs, or container IDs, processes, etc?
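To make the netflow idea a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the flow tuples are a simplified stand-in for real netflow records, and the role CIDRs, the `app-to-oracle` label, and the helper names `role_of` and `label` are hypothetical.

```python
import ipaddress
from collections import Counter

# Hypothetical role -> CIDR mapping (illustrative addresses only).
ROLE_CIDRS = {"app": "10.0.1.0/24", "db": "10.0.2.0/24"}

# Hypothetical model edges: (client role, server role, protocol, port) -> label.
MODEL_EDGES = {("app", "db", "tcp", 1521): "app-to-oracle"}


def role_of(address):
    """Return the first role whose CIDR contains the address, or None."""
    ip = ipaddress.ip_address(address)
    for role, cidr in ROLE_CIDRS.items():
        if ip in ipaddress.ip_network(cidr):
            return role
    return None


def label(flow):
    """Map a simplified (src, dst, protocol, dst_port) flow to a model label."""
    src, dst, proto, port = flow
    return MODEL_EDGES.get((role_of(src), role_of(dst), proto, port), "unmodeled")


# Simplified flow tuples, not a real netflow export format.
FLOWS = [
    ("10.0.1.5", "10.0.2.9", "tcp", 1521),
    ("10.0.1.6", "10.0.2.9", "tcp", 1521),
    ("10.0.1.5", "8.8.8.8", "udp", 53),
]

# Aggregate by model label instead of by protocol/port/subnet.
summary = Counter(label(flow) for flow in FLOWS)
```

Aggregating on the label rather than the raw tuple is the whole point: two flows from different app hosts collapse into one model edge, and anything that matches no edge is flagged for a closer look.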
So, there’s a lot more interesting stuff beyond just the firewall, and it’ll be interesting to see what comes up. But I still worry about the complexity, so I want to figure out ways to reduce that complexity.
The first one is to not have specific models (“this application’s Oracle DB”) for everything and to be able to use more generic models (“Oracle DB communication”). This means having an ability to reference a model. I’m still not sure how to do that. So, I’m taking a step back to come up with some use cases to help noodle through this, and the remainder of this post examines them. I’m not committing to anything, so you may see a few possible implementations below.
Simple 3 Tier Application
This is your classic three tier application.
A sample general model could look like:
This is the case of the same model being applied in two different contexts with one overlapping resource. The example is a shared DB resource (here shared between dev and prod, but probably shared across multiple DB)
A fully expanded model could look like:
However, in reality, there’s a base model which looks like just:
The question is really about how to relate multiples together. Looking at roles:
This works in this simple example, but I’m not sure it covers everything (see below).
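As a rough sketch of how a base model plus context-qualified roles might expand into concrete rules, here is one possible Python take. The role names, node names, the `context::role` string format, the ports, and the `expand` helper are all assumptions for illustration, not a settled design.

```python
from itertools import product

# Hypothetical base model: directed (client role, server role, port) edges.
BASE_MODEL = [
    ("web", "app", "8080/tcp"),
    ("app", "db", "1521/tcp"),
]

# Hypothetical role bindings: node -> list of "context::role" strings.
# The shared DB is bound into both the dev and prod contexts.
ROLES = {
    "dev-web": ["dev::web"],
    "dev-app": ["dev::app"],
    "prod-web": ["prod::web"],
    "prod-app": ["prod::app"],
    "shared-db": ["dev::db", "prod::db"],
}


def expand(model, roles):
    """Expand the base model into (src node, dst node, port) rules per context."""
    members = {}
    for node, bindings in roles.items():
        for binding in bindings:
            context, role = binding.split("::")
            members.setdefault((context, role), []).append(node)

    rules = []
    for context in sorted({ctx for ctx, _ in members}):
        for client, server, port in model:
            sources = members.get((context, client), [])
            targets = members.get((context, server), [])
            for src, dst in product(sources, targets):
                rules.append((src, dst, port))
    return rules


RULES = expand(BASE_MODEL, ROLES)
```

Because edges are only matched within a context, `dev-app` reaches `shared-db` but never `prod-app`, while the shared DB picks up rules from both contexts.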
Same model applied to node as two different roles
This is one that masks quite a bit so it’s not clear what the perfect setup is. The simple case is that there’s a DB that serves sqlnet, but in turn also connects to other DBs using sqlnet (e.g. replication).
This could look like:
The “db-server” and “db-client” part feels a bit weird. I kinda want to just have “server” and “client” but then feel like I need another name hierarchy - e.g. “db::client” and “db::server” - so the roles would look like:
This looks ok, but there are two concerns for me:
- How many things are “client” or “server”? Would there be a way to simplify that?
- Having to have a context for all of the directed pairings seems a bit overdone. Is there a way to simplify that?
The latter concerns me more. Maybe the answer is to drop the pairwise approach and attach the context a bit more to the node (in this case) itself:
ro1-db: [main::db::client, ro1::db::server]
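Here is a minimal Python sketch of how node-level bindings like the one above might be resolved back into directed pairs. It assumes each binding reads as `context::model::role` and that a client only pairs with servers in the same context and model; the node names other than `ro1-db` and the `pairings` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical node bindings in "context::model::role" form.
# ro1-db serves its own context and is a client of the main DB.
NODES = {
    "main-db": ["main::db::server"],
    "main-app": ["main::db::client"],
    "ro1-db": ["main::db::client", "ro1::db::server"],
    "ro1-app": ["ro1::db::client"],
}


def pairings(nodes):
    """Derive (client node, server node) pairs within each context::model scope."""
    by_scope = defaultdict(lambda: {"client": [], "server": []})
    for node, bindings in nodes.items():
        for binding in bindings:
            context, model, role = binding.split("::")
            by_scope[(context, model)][role].append(node)

    flows = []
    for scope in sorted(by_scope):
        group = by_scope[scope]
        for client in group["client"]:
            for server in group["server"]:
                if client != server:  # skip a node pairing with itself
                    flows.append((client, server))
    return flows


FLOWS = pairings(NODES)
```

The directed pairs fall out of the node bindings, so nothing has to declare a context for each pairing by hand.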
Node as Multiple models
This is the case of having a node participate in multiple models. The example is that the node is part of its main role (app or db), but it’s also being monitored and logged into (so, “adminee” controlled by “adminbox”).
(insert app/db models above)
With an example of multiple roles put together:
This is more of “if there are overlapping attributes, the node needs to get every role that matches that attribute.” Simple example of overlapping IP addresses/CIDRs:
In this case, 192.168.1.50/32 has both [adminbox,adminee]
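The overlap lookup can be sketched with Python's standard `ipaddress` module. The CIDR assigned to `adminee` below is an assumption chosen so that it contains the adminbox address, and `roles_for` is a hypothetical helper name.

```python
import ipaddress

# Hypothetical role -> CIDR list; the /24 is assumed for illustration.
ROLE_CIDRS = {
    "adminbox": ["192.168.1.50/32"],
    "adminee": ["192.168.1.0/24"],
}


def roles_for(address, role_cidrs):
    """Return every role whose CIDRs contain the given address, sorted by name."""
    ip = ipaddress.ip_address(address)
    return sorted(
        role
        for role, cidrs in role_cidrs.items()
        if any(ip in ipaddress.ip_network(cidr) for cidr in cidrs)
    )


BOTH = roles_for("192.168.1.50", ROLE_CIDRS)
ONLY_ADMINEE = roles_for("192.168.1.7", ROLE_CIDRS)
```

Under these assumed CIDRs, 192.168.1.50 picks up both roles while any other host in the /24 is only an adminee.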
Some models are a bit self-referential. Nodes of the same role will talk to each other (cluster members). Nodes of the corresponding role (cluster members in different subsections of the cluster) will talk to each other in another way. The poster child for this is Cassandra:
So, a model might look like:
And the roles might look like:
I’m actually surprised by this model. It seems to be one of the cleanest but it’s also pretty complex. Feels like a trap but I’m not seeing it yet.
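To poke at where the trap might be, here is a small Python sketch of a self-referential model. It assumes one role whose members talk to each other one way inside a context and another way across contexts. The node names, the idea of a datacenter as the context, and the choice of 7000/tcp versus 7001/tcp to tell the two cases apart are illustrative assumptions, not a statement about how any particular cluster is configured.

```python
from itertools import permutations

# Hypothetical membership: context (e.g. a datacenter) -> nodes in the same role.
CLUSTER = {
    "dc1": ["cass-1a", "cass-1b"],
    "dc2": ["cass-2a"],
}

# Assumed ports, used only to distinguish the two kinds of edge.
INTRA_PORT = "7000/tcp"  # same context
CROSS_PORT = "7001/tcp"  # different context


def cluster_flows(cluster):
    """Every member talks to every other member; the port depends on the context."""
    tagged = [
        (node, context)
        for context in sorted(cluster)
        for node in cluster[context]
    ]
    flows = []
    for (src, src_ctx), (dst, dst_ctx) in permutations(tagged, 2):
        port = INTRA_PORT if src_ctx == dst_ctx else CROSS_PORT
        flows.append((src, dst, port))
    return flows


FLOWS = cluster_flows(CLUSTER)
```

One thing the sketch makes visible is that the rule count grows with the square of the member count, which may be part of why the model looks clean while the expansion is not.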
Uh… distinct items?
I’m having trouble describing this one, and a bit of trouble reasoning about it.
The general idea is that there are cases where you need a general pattern, replicated many times with specific contexts. The simple example would be 30 nodes, each of which has a self-referential pattern that only refers to itself. This is kinda like the Cassandra situation, with the subtle distinction that each Cassandra node talks to all other Cassandra nodes, whereas in this case each node would only talk to itself. Effectively each node is its own context for a role (as ugly as that sounds) that follows the pattern.
There are two practical answers for this right now:
- Since it’s self-referential, it is actually unlikely to need defining (most people can talk to themselves, and processes are probably listening on localhost anyway, which has overlapping IP space, and thar be dragons with trying to reason down that one right now).
- You can enumerate each as a separate context - this seems like a workaround, but it at least allows for it, just not efficiently.
So, that may be enough of a starting point.
I think that’s enough for now. Definitely something to help ponder through all of this…