A few days ago, the Kubernetes community announced a new vulnerability:
This is a normal event with any software package. Vulnerabilities are found, fixed and announced, but not necessarily in this order.
As a Kubernetes operator or a security manager, you should be watching for such announcements, for example by subscribing to this RSS feed: https://groups.google.com/forum/feed/kubernetes-announce/msgs/rss_v2_0.xml?num=50
As soon as you hear about such a vulnerability, you should upgrade to a version that fixes it. In this case, the fix comes from upgrading the underlying CNI plugin (Calico, Weave and others).
This catch-up game is tricky, and upgrading is not always easy, especially when the cluster is running in production and serving a critical function.
But there is another option, and while it doesn’t eliminate the need to manage vulnerabilities, it is often more effective.
Locking down your systems by removing all unnecessary permissions can often provide proactive mitigation for many kinds of vulnerabilities.
In this case, for example, it would have been sufficient to drop the NET_RAW capability (CAP_NET_RAW) from your containers to mitigate the risk. You can do that by adding four lines to your container YAML, under the container's securityContext, that list NET_RAW in capabilities.drop.
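The original snippet did not survive here, so the following is a minimal sketch of what those four lines look like in context. The pod name, container name and image are hypothetical placeholders; only the securityContext block is the part that matters.

```yaml
# Hypothetical pod manifest; the names and the image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
  - name: app
    image: registry.example.com/app:1.0
    # The four lines that drop the capability for this container:
    securityContext:
      capabilities:
        drop:
        - NET_RAW
```

The setting is per container, so it has to be repeated for each container in the pod. Anything inside the container that opens raw sockets (some ping implementations, for instance) may stop working after this change, so test the application before rolling it out.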
But in practice, most containers that we see in production run with this capability. Why? Because it is enabled by default in Kubernetes.
This is a fundamental problem of security policy management, and at Tufin we refer to it as the “Fine Line”:
The “fine line” is the idea that overly restricting a system risks business continuity (how can you be sure that the application will run properly without this permission?), while granting excess permissions exposes you to the risk of a malicious attack.
As the person responsible for security, you are walking that fine line.
This is a universal problem that occurs in all fields of infosec: network security, identity and access management, system management, application security, data security, and so on.
So, what can you do?
First, you need a security manager. Regardless of the size of the operation, someone must own security.
Second, you need the knowledge: which permissions are risky and what the best practices are. This requires ongoing research. And lastly, you need visibility into the running configurations.
This can be very time-consuming, especially when the environment is large and heterogeneous, with many technologies from different suppliers, so you usually end up using a tool to help you maintain a decent level of security.
Tufin SecureCloud was built for exactly this purpose. It provides visibility into Kubernetes environments, highlights risky configurations and recommended mitigations, and displays a dashboard with security scores to enable continuous improvement.
In this screenshot you can see a service that is exposed externally and also has a container with the NET_RAW capability enabled: