I don't know how I missed James Lewis' talks about Scale, Flow and Microservices - or really Complex Adaptive Systems. If nothing else, I recommend watching one like this (there are quite a number of videos available on YouTube). It talks about how organizations scale, age, and die and is based on complexity theory and science that applies to all sorts of examples outside of software (cities, mammals).
It really got me thinking about how security fits in at organizations that are following these sorts of models. This post is going to explore that and how we in security can adapt to help our organizations scale better. I'd love to hear from you to discuss further.
Classically, when we think about lean enterprise and the organizational structures around decentralized, microservices-based organizations, we tend to think of DevSecOps. In my head, this is a fancy jargon term for something like:
Everyone on the team has shared responsibility for making each feature meet a real need, building it well (Dev), deploying it in a stable way, and ensuring that it operates correctly (Ops). Oh and we should be doing security in that mix as well.
- Me 🙂
It also ties to a term a lot of people are using, Pushing Left, which basically means doing security as early in the development process as possible. There are a lot of implied or commonly accepted practices in DevOps that help us do security more nimbly:
Both of these concepts (DevSecOps and Pushing Left) can help an organization maintain its flow state and continue to execute better overall, because the tools are running often and seamlessly and the people are somewhat integrated. They are also well suited for decentralized organizations. Since security is in the DevOps (DevSecOps), in theory you have a way to weave security into each team.
But in most cases I see in the real world, this breaks down in several ways:
One of the most common things I see with DevOps is that organizations take their Operations team, teach them Terraform (or some other infrastructure as code variant) and call them the DevOps team.
This is both unfair to these people (because suddenly they are writing code for all sorts of infrastructure that didn't use to be their responsibility) and also goes against the spirit of DevOps. If ops isn't embedded with development and the dev team doesn't have shared responsibility for delivery and operations, then it really isn't DevOps.
Extending this to DevSecOps, most often I see this as a small DevSecOps team that helps other actual Dev teams put automation into their CI/CD process. It is adjacent to what I think DevSecOps should be, but it isn't it.
Security should be something we're thinking about at each step of the way, as we write requirements, write test cases, do infrastructure as code, deploy, monitor, etc. If it is a separate team, it is nearly impossible to be involved in all of these things.
I don't necessarily see a future where security doesn't have to exist as a first class organization, separate from distributed implementation teams. There are certain things that roll up and are hard without a single central point for communication to key stakeholders, or which require a first class team to focus on them, e.g.:
But, and this is a big but, I think the idea of a large security organization doing a lot of work managing the security of separate IT and Software systems is a model that is doomed to fail as a complex system.
One easy thing to look at is change control. Amusingly, the Lewis talk mentions the following:
Change request boards are a predictor of low performance.... You are deliberately blocking your organization's arteries with meetings.
James Lewis - MicroCPH Keynote: Flow, Microservices and Scale
If you think about it, this isn't a surprise. I've seen organizations where the change board only meets once every two weeks! So you can only get changes out very infrequently. Sometimes there is a backlog of changes and they can't all even be considered in the next change board meeting! Further, the change board often includes a lot of important people from across the organization. This means people feel there is a high bar for submission, and preparing a change request can become onerous: documenting in detail who requested it, what is going to happen, what the rollback plan is, etc.
We typically recommend a much simpler and more distributed process where changes are approved through pull requests to certain branches.
Sometimes this means a feature branch has a PR to main (feature/my-cool-feature->main), and main is what is deployed.
Other times there is a PR from a feature branch to a main code branch and then another PR to a deploy branch: feature/my-cool-feature->main and then main->prod, where the ->prod PR is the one that actually triggers deployment.
In either case, the following are true:
As long as we take the review seriously and think through other issues the change could cause, we've basically done the same thing the change review board would have done, but with much less waiting and far fewer resources expended.
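To make the pull request model a bit more concrete, here is a minimal sketch of the kind of gate a team might enforce. Everything in it is an assumption for illustration: the `PullRequest` class, the `PROTECTED_BRANCHES` set, and the rule itself (passing checks plus at least one approval from someone other than the author) are hypothetical and are not taken from any particular Git hosting platform.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical, simplified model of a pull request."""
    author: str
    source_branch: str
    target_branch: str
    approvals: set[str] = field(default_factory=set)
    checks_passed: bool = False


# Assumption: PRs into these branches are the ones that act as change approval.
PROTECTED_BRANCHES = {"main", "prod"}


def change_is_approved(pr: PullRequest) -> bool:
    """Return True when the PR may be merged into its target branch."""
    if pr.target_branch not in PROTECTED_BRANCHES:
        return True  # no gate on ordinary working branches
    # The author's own approval does not count as an independent review.
    independent_approvals = pr.approvals - {pr.author}
    return pr.checks_passed and len(independent_approvals) >= 1


if __name__ == "__main__":
    pr = PullRequest(
        author="alice",
        source_branch="feature/my-cool-feature",
        target_branch="main",
        approvals={"bob"},
        checks_passed=True,
    )
    print(change_is_approved(pr))
```

In practice most teams would express this rule through their platform's branch protection settings rather than custom code; the point is only that the approval lives next to the change and is decided by the people closest to it.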
One key to making this work is that the different areas that can have parallel independent change processes must be decoupled. In this case, Lewis is advocating that using microservices properly (specifically where different teams interact through a contract expressed in an interface, not documentation) enables this decoupling which turns out to help an organization maintain speed and innovation.
Both of the following are from James' talk. The first, on the left, shows a smaller hierarchical graph and then a larger one, and is essentially intended to illustrate how the stress on a company grows with the depth of the hierarchy. The picture on the right illustrates a network of small networks. It turns out that if the organization can work this way, then it can get faster as it scales rather than slower as in the hierarchical example.
Again, I really recommend just going to the source (the James Lewis talk) to get the full idea here. It is fun and well delivered.
But what does this mean for security? My first instinct is to design processes and embed people that look like a node in the small graph or which can be done in the context of the small graph. I can already hear the pushback that this would mean having 5 security folks to support 5 teams and that is just not a realistic resource allocation. Maybe that's true. Maybe they could be split. Maybe in the true spirit of DevOps, the teams should be taking on the idea of security and doing it fully as a collaborative part of what they do. I'm not sure I know the right answer here yet, but it makes for some interesting thought processes.
What are some other examples of doing security well in a decentralized org?
Some are easy. Thinking about security requirements should be mostly easy on a team by team basis. Security unit tests are only really possible on a team by team basis. Training is another great example of something that can and probably should be tailored for each team but often isn't.
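As an illustration of what a team-owned security unit test can look like, here is a small sketch. The `Invoice` type and the `can_view_invoice` rule are hypothetical stand-ins for whatever authorization logic a team actually owns; the tests are written as plain functions so they run under pytest or on their own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    """Hypothetical resource with a single owner."""
    invoice_id: int
    owner_id: int


def can_view_invoice(user_id: int, is_admin: bool, invoice: Invoice) -> bool:
    """Assumed rule: only the owner or an admin may view an invoice."""
    return is_admin or user_id == invoice.owner_id


def test_owner_can_view():
    assert can_view_invoice(1, False, Invoice(10, owner_id=1))


def test_other_user_is_denied():
    # The negative case is the security test: a valid, logged-in user
    # must not be able to read someone else's invoice.
    assert not can_view_invoice(2, False, Invoice(10, owner_id=1))


def test_admin_can_view():
    assert can_view_invoice(2, True, Invoice(10, owner_id=1))


if __name__ == "__main__":
    test_owner_can_view()
    test_other_user_is_denied()
    test_admin_can_view()
    print("all security unit tests passed")
```

Only the team that owns the feature knows which negative cases matter, which is why this kind of test is hard to write from a central security group.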
Some are harder.
Central logging? We commonly talk to people that want to aggregate more different log sources together so that they can enable their security operations center (SOC) to have broad visibility. In practice, these application signals are rarely actually understood or analyzed by the SOC. It would be more efficient to task the dev teams with building their own visibility and having a way to escalate events that are truly suspicious.
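A rough sketch of that idea: the team keeps its own application events, applies a rule it understands, and forwards only what crosses the line. The `AuthEvent` type, the failed-login rule, and the threshold of 5 are all assumptions chosen for illustration; a real team would pick signals that fit its own application.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthEvent:
    """Hypothetical application-level login event owned by the dev team."""
    user: str
    success: bool


# Assumption: the team decides this many failed logins is worth escalating.
FAILED_LOGIN_THRESHOLD = 5


def users_to_escalate(
    events: list[AuthEvent], threshold: int = FAILED_LOGIN_THRESHOLD
) -> list[str]:
    """Return the users whose failed-login count meets the threshold."""
    failures = Counter(event.user for event in events if not event.success)
    return sorted(user for user, count in failures.items() if count >= threshold)


if __name__ == "__main__":
    events = [AuthEvent("mallory", False)] * 5 + [
        AuthEvent("alice", False),
        AuthEvent("alice", True),
    ]
    # Only the escalated users would be sent on to the SOC.
    print(users_to_escalate(events))
```

The SOC then receives a short list with context from the team that understands the signal, instead of a raw stream of application logs.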
Common CI/CD Process and Security Tooling? It is very attractive to have security advocate for a common toolchain for doing builds, running tests and then running additional security tooling. However, when security starts trying to dictate tooling, even things like SAST or SCA tooling, it is often less well adopted than when developers have responsibility themselves. How often do we see a tool that is running but developers don't see or act on the results? Often these tools produce terrible value and developers are right to ignore them. But if they were invested and helped set them up and tune them, maybe they would get better results. Sure, we can get a discount per seat if we buy more seats, but maybe the best model is to accept the inefficiency of fragmented tools and just let teams go fast in their own way.
Security policies and processes? On the policy side, I almost always see consolidated policies, but then they aren't generally followed very actively by development teams. Processes are similarly documented but only followed when needed and by teams that must follow them. "Write once, read never" often applies. It seems obvious to me as I think about it that it could be more effective to have much more diversity of policy and process, so long as it effectively met the goals of the teams. That seems more likely to happen if it is built and managed by the teams. This takes time and money of course, but it is more likely to succeed than a monolithic top down program.
As I think about more of the security controls that apply at a team level (see also stuff I've written about AppSec programs), it seems like being able to reason flexibly and let teams run with whatever works well for them could be a big win at many companies!
It is very tempting in security to think that if we standardize, we'll make the organization better. This may not be the case. It may be better to learn to live in a decentralized security context. This makes me want to explore tooling that would help support this sort of decentralized team reporting approach... maybe more to come there?
The core references for this post are the following: