Runtime Security and the Role of eBPF/BPF-LSM

by Rahul Jadhav | January 31, 2024

Why did we bet the farm on it?

Cisco’s recently announced acquisition of Isovalent is a game changer, a once-in-a-decade event that will have a lasting impact on cloud networking and security. While the foundational innovations that led to eBPF (Extended Berkeley Packet Filter) were created at Facebook, Isovalent deserves immense credit for creating a vibrant community and for the technical innovations that have made eBPF the de facto kernel observability engine. With the power of Cisco behind it, we believe eBPF has the potential to become the next ubiquitous industry standard, like TCP/IP or BGP. AccuKnox was an early adopter of eBPF and has made substantial contributions to it. Given that, and given our traction with KubeArmor and its Zero Trust inline security capability, we are enthused about the value we can add in addressing current and emerging security threats in Public Cloud, Private Cloud, Edge/IoT, and 5G. The rest of this post adds depth and detail.

eBPF – what makes it unique

BPF (Berkeley Packet Filter), by Steven McCanne and Van Jacobson, was a path-breaking technology some 30 years ago. eBPF took it to an entirely new level. Just as JavaScript made the browser a programmable entity, resulting in a rich user experience, eBPF made the kernel programmable, turning it into a versatile platform for observability and security. eBPF enables user-defined policies to be converted into bytecode that can be safely attached to in-kernel hooks, allowing dynamic decision-making inside the kernel. Because eBPF can act on policy at the earliest stages of decision-making, it yields a highly performant policy engine. Combined with the safety guarantees offered by the eBPF verifier, this provides a primitive that can be safely used in production environments.

While eBPF was typically used in observability and monitoring scenarios, the growth of kernel-space hooks and the bpf-helpers ecosystem allowed it to be used for redirecting and dropping network packets, paving the way for in-kernel policy enforcement in networking scenarios. Further extensions such as seccomp-bpf and BPF-LSM allowed it to be used for systems-level policy enforcement as well.

Cilium – value add

Cilium was one of the first engines to realize the potential of such in-kernel decision-making for network packet processing and built its core around it. Kubernetes was grappling with scale issues induced by heavy use of iptables-based rules. Cilium’s use of eBPF-based packet routing enabled identity-aware, in-kernel packet redirection that was especially useful in Kubernetes scenarios. Further, Cilium’s use of eBPF for packet redirection in scenarios such as Direct Server Return reduced the additional hops a packet has to traverse in the multi-proxy processing setups that are widely used in the k8s world.

Our journey to eBPF

Our primary focus at AccuKnox from its inception was to tackle issues with runtime security. We found that runtime security tooling was neither mature nor performant. Since most advanced, zero-day, and signature-less attacks manifest at runtime, there was a need for a runtime security solution that provides not only forensics and observability but also runtime policy enforcement. Furthermore, our fundamental philosophy was guided by Zero Trust, i.e., how do we ensure least-privilege access at runtime and enforce it in the best possible way? We found eBPF to be the right primitive for this, since it was already established for the observability use case and was maturing for enforcement use cases.

Two dimensions stood out for us: Networks and Systems. AccuKnox was co-created in partnership with SRI International (previously Stanford Research Institute), which is our investor and long-term R&D partner. The team at SRI had been working on “Bastion”, which enabled eBPF-based network policy control. This was roughly in July 2020, and we saw the emergence of another CNCF open-source project, “Cilium”, which had a similar design rationale. Even though Bastion was already developed by SRI, we found that the Cilium team was ahead in terms of feature set and managed platform support. Apart from having a strong core team, Cilium benefited from external contributors who extended and validated the Cilium engine across multiple platforms. By early 2021, we decided that there was no point in reinventing the wheel, and that the few features Bastion had beyond Cilium would be better added to Cilium itself. Also, unlike Calico, the Cilium team ensured that both network policy enforcement and telemetry/alerts were available in the open-source version. Thus we shelved Bastion and decided to focus on Cilium. By the second quarter of 2021, the AccuKnox team had already started pushing changes to Cilium; roughly 15 AccuKnox contributors have made changes to Cilium. AccuKnox also extended its policy discovery engine to auto-discover network and L7 policies based on Cilium/Hubble telemetry and offered this as part of its SaaS offering.

While Cilium was great, we knew that another dimension, “Systems hardening/protection”, was left open. The fundamental execution unit in Kubernetes is a pod, and an attacker who manages to compromise a pod would have unrestricted access to the cluster through that pod. We needed a solution to harden system operations (such as process executions, file accesses, and network operations) within the pod to achieve a real multi-layer Zero Trust and defense-in-depth approach. Within a pod, only certain apps/binaries need access to sensitive assets, and only certain binaries need network and capabilities access. Most importantly, in most pods only certain binaries need to be executed at all. We decided to build a systems runtime enforcement engine that leverages LSMs for policy enforcement to contain such behavior.

LSMs such as AppArmor and SELinux have been used for host protection for the past few decades, but they never found their way into Kubernetes security, even though Pod Security Policies supported AppArmor and SELinux policies. We studied the pain points: the ephemeral and highly dynamic nature of pods on K8s, and the steep learning curve of the security policy languages for anyone who wants to deploy AppArmor- or SELinux-based policies. We asked ourselves whether we could leverage the Kubernetes native resource model to orchestrate systems security policies that would eventually be implemented by the underlying Linux LSM/seccomp-bpf layer. Thus KubeArmor came into being.

I remember the discussion between Suchakra Sharma, Jaehyun Nam, and me. Suchakra made it very clear that if we wanted to deploy policies using the right security principles, then LSMs (Linux Security Modules) were the only model to go with. We all agreed that users should not have to deal with the intricacies of AppArmor and SELinux, and thus the user interface had to be the k8s resource layer. Users should be able to add, delete, and modify policies through the k8s resource layer, and we would need an engine that understands these policies and converts them into appropriate LSM policies under the hood. The NSA Kubernetes Hardening Guide, released in March 2022, explicitly recommended “Hardening applications against exploitation using security services such as SELinux®, AppArmor®, and secure computing mode (seccomp)”, which added independent validation and credence to our approach.
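The idea of a Kubernetes-style policy being converted into LSM rules under the hood can be sketched in a few lines of Python. This is a toy illustration and not KubeArmor's actual translation code: the dictionary only loosely mirrors the shape of a KubeArmor policy, and the `to_apparmor_rules` helper and the simplified rule text it emits are assumptions made for this example.

```python
# Toy sketch: translate a Kubernetes-style policy into AppArmor-like rules.
# The policy layout and the helper name are illustrative assumptions,
# not KubeArmor's real translation logic.

policy = {
    "kind": "KubeArmorPolicy",
    "metadata": {"name": "block-pkg-mgmt"},
    "spec": {
        "selector": {"matchLabels": {"app": "nginx"}},
        "process": {"matchPaths": [{"path": "/usr/bin/apt"},
                                   {"path": "/usr/bin/apt-get"}]},
        "file": {"matchPaths": [{"path": "/etc/shadow"}]},
        "action": "Block",
    },
}


def to_apparmor_rules(policy):
    """Return simplified AppArmor-style deny rules for a 'Block' policy."""
    spec = policy["spec"]
    if spec.get("action") != "Block":
        raise ValueError("this sketch only handles action: Block")
    rules = []
    for entry in spec.get("process", {}).get("matchPaths", []):
        rules.append(f"deny {entry['path']} x,")   # x = execute
    for entry in spec.get("file", {}).get("matchPaths", []):
        rules.append(f"deny {entry['path']} rw,")  # rw = read and write
    return rules


for rule in to_apparmor_rules(policy):
    print(rule)
```

The point of the sketch is the separation of concerns: the user only ever edits the Kubernetes-style resource, while the engine owns the LSM-specific syntax.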

Policy enforcement requires a set of policies, and these policies are created based on application behavior. KubeArmor leveraged eBPF observability to generate application behavior telemetry and to report policy violations raised by the LSM (AppArmor/SELinux) layer. AccuKnox further built instrumentation in its enterprise solution that feeds on KubeArmor-emitted application telemetry to auto-recommend custom, app-specific Zero Trust policies that restrict pod behavior to the minimum required.

Challenges we faced…

With KubeArmor, our focus was on building a runtime security enforcement engine that used the right security principles. We did not want to compromise on the sound security principles that in-kernel primitives such as LSMs offer. First, there were two classes of weakness, TOCTOU (Time of Check to Time of Use) and Semantic Poisoning, that we wanted to ensure our policy engine does not suffer from. Second, we wanted to avoid post-exploit mitigation techniques (essentially the detect-and-respond model).

However, we soon realized that ensuring policy coverage across a plethora of platforms was a big challenge. SELinux was turning out to be a beast that was difficult to tame. GCP and Azure environments supported AppArmor, while EKS (Amazon Linux) and RHEL supported only SELinux. It was extremely difficult to navigate the “type enforcement” rules that SELinux uses. Changing one SELinux type-enforcement rule had a domino effect on all sorts of other SELinux rules, and it was difficult to ensure that we were not opening up new security holes with the newly injected rules. Within a few months, we understood that we had to ditch SELinux for container/pod-based policies. For hosts/k8s worker nodes, SELinux was still manageable for host protection, but it was infeasible for pod protection.
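To make the TOCTOU concern mentioned above concrete, here is a small Python simulation. It is only a model, with no real syscalls involved, and all names in it are hypothetical. It contrasts a check performed on a caller-owned buffer before the value is copied for use with a check performed on the same copy that is actually used, which is roughly the property an LSM hook gives you.

```python
# Toy model of a TOCTOU (time-of-check to time-of-use) race.
# Everything here is a simulation; no real syscalls are involved.

BLOCKED = {"/etc/shadow"}


def open_checked_at_entry(user_buf, race=None):
    """Check the caller-owned buffer first, copy it for use later."""
    allowed = bytes(user_buf).decode() not in BLOCKED   # time of check
    if race:
        race(user_buf)               # attacker rewrites the buffer in between
    kernel_path = bytes(user_buf).decode()              # time of use
    return kernel_path if allowed else None


def open_checked_at_hook(user_buf, race=None):
    """Copy once, then check the same copy that will be used."""
    if race:
        race(user_buf)
    kernel_path = bytes(user_buf).decode()   # single copy into "kernel" memory
    return kernel_path if kernel_path not in BLOCKED else None


def swap_to_shadow(buf):
    """Simulated attacker: overwrite the path after the check has passed."""
    buf[:] = b"/etc/shadow"


# The entry-time check is bypassed; the hook-time check is not.
print(open_checked_at_entry(bytearray(b"/tmp/notes"), swap_to_shadow))
print(open_checked_at_hook(bytearray(b"/tmp/notes"), swap_to_shadow))
```

In the first call the blocked path slips through because the decision was made on a value the attacker could still change; in the second, the decision and the use share one value, so the swap has no effect on the outcome.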

Without SELinux, the question for us was, “How do we handle the EKS and RHEL/OpenShift platforms, which constituted more than 60% of our user deployments?”

Chance favors a prepared mind

Most advances require both good insights and good fortune (quote from Douglas Osheroff). While we at AccuKnox had insights into the pains, we couldn’t alleviate some of the platform-specific issues with respect to the use of LSMs.

Then something changed in 2022!

BPF-LSM (or KRSI, Kernel Runtime Security Instrumentation, as it was called back then) was proposed by Google in 2019, was merged into the upstream Linux kernel in version 5.7 (2020), and gradually found its way into the default distros of the managed cloud providers. In 2021, fewer than 5% of distros supported BPF-LSM out of the box; in 2022 this rose to 50%, and by 2023 more than 80% of distros supported it out of the box.

In 2022, Amazon was working on Amazon Linux 2022 (AL2022), which had BPF-LSM enabled by default. While AL2022 never made it to production in 2022, it was later renamed Amazon Linux 2023 (AL2023) and became the default production distribution on EKS. Bottlerocket followed suit. Surprisingly, even RHEL decided to backport BPF-LSM to RHEL versions 8.5 and later. All of a sudden, BPF-LSM was present everywhere. The KubeArmor team had spotted this trend early and had introduced BPF-LSM enforcement support in early 2022. By the end of 2022, the BPF-LSM enforcer was mature enough that the KubeArmor team made it the preferred enforcer (in cases where both AppArmor and BPF-LSM were available).
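The "preferred enforcer" decision described above can be sketched as follows. On Linux, the active LSMs are listed as a comma-separated string in `/sys/kernel/security/lsm`; the function names and the preference list are our own illustrative choices and are not taken from the KubeArmor code base.

```python
# Sketch: choose an enforcer from the list of active LSMs.
# The preference order mirrors the text above (BPF-LSM first, then AppArmor);
# function names are illustrative assumptions, not KubeArmor's.

from pathlib import Path

LSM_FILE = Path("/sys/kernel/security/lsm")
PREFERENCE = ["bpf", "apparmor"]


def choose_enforcer(active_lsms):
    """Return the preferred available enforcer, or None if neither is active."""
    active = {name.strip() for name in active_lsms.split(",") if name.strip()}
    for candidate in PREFERENCE:
        if candidate in active:
            return candidate
    return None


def read_active_lsms(path=LSM_FILE):
    """Read the active LSM list, returning an empty string if unavailable."""
    try:
        return path.read_text()
    except OSError:
        return ""


print(choose_enforcer("lockdown,capability,yama,apparmor,bpf"))
print(choose_enforcer(read_active_lsms()))
```

The second call reports whatever the current machine supports, so its result depends on the kernel and distribution it runs on.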

KubeArmor became the first security engine to operationalize generic user-specified policy rules for runtime security using BPF-LSM. The combination of BPF-LSM and AppArmor provided > 95% coverage of the platforms for KubeArmor.

BPF-LSM essentially allowed KubeArmor to convert user-specified policies into eBPF bytecode that could then be attached at LSM hooks. This provided sound security principles and the best coverage. This level of programmability for security use cases fundamentally changed how enforcement could be handled. KubeArmor no longer needed to operate within the constraints of the AppArmor or SELinux policy languages (such as CIL, the Common Intermediate Language).

BPF-LSM is fundamentally changing how workload hardening is done. Just as iptables was replaced by eBPF-based rules engines, BPF-LSM could replace AppArmor- and SELinux-based security hardening. However, unlike AppArmor and SELinux, BPF-LSM is a stackable LSM, i.e., it can operate alongside AppArmor and SELinux, thus providing multi-layer defense. The way we see it, most distributions will continue to offer host/node-based hardening policies out of the box as part of the distribution itself, and KubeArmor will be used to tackle workload or application hardening for pods or container images (deployed directly using docker on the host).
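The stacking behavior can be modelled with a short Python sketch: each registered module's hook is consulted in order, and the first non-zero (error) return denies the operation. The hook functions below are stand-ins invented for this example; real hooks live in the kernel and are not written in Python.

```python
# Toy model of LSM stacking: an operation is allowed only if every
# stacked module's hook returns 0. Hook bodies are made-up stand-ins.

import errno


def apparmor_file_open(path):
    """Stand-in for a distribution-provided host profile that allows the open."""
    return 0


def bpf_lsm_file_open(path):
    """Stand-in for a workload policy attached through BPF-LSM."""
    return -errno.EPERM if path == "/etc/shadow" else 0


def security_file_open(path, hooks):
    """Call each hook in order; the first non-zero result denies the open."""
    for hook in hooks:
        rc = hook(path)
        if rc != 0:
            return rc
    return 0


stack = [apparmor_file_open, bpf_lsm_file_open]
print(security_file_open("/etc/hostname", stack))
print(security_file_open("/etc/shadow", stack))
```

Because a denial from any layer is final, a workload policy added through BPF-LSM can only tighten what the host profile already permits; it cannot loosen it.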

With BPF-LSM, the coverage for KubeArmor increased multifold, and we were able to support EKS, GKE, AKS, and OpenShift, out of the box.

KubeArmor: Why did we open-source it?

We had witnessed first-hand the power of open-sourcing with Cilium. When we began working on KubeArmor, we realized that it was a radically different approach, and that if it had to fail, it had to fail fast. Furthermore, we at AccuKnox fundamentally believe that any core security engine has to be open-sourced for security reasons alone. As an organization, we did not want to maintain multiple forks of KubeArmor, since we had a clear view of the additional tooling we could build to simplify the user journey when managing multiple k8s deployments. This additional tooling became part of the enterprise solution, which provides policy discovery/recommendation and channel integration support.

Inline/Pre-emptive Mitigation vs Post Attack Mitigation

One of the core KubeArmor differentiators is its capability to do inline mitigation. With KubeArmor, one can specify rules that allow certain behaviors and deny everything else. When it comes to mitigation/enforcement, most other solutions, such as Falco and Tetragon, follow the detect-and-respond model, i.e., if a process does something it is not supposed to do, or if an unknown process is executed, the engine kills the process or quarantines the pod/node itself.

Tetragon leverages the bpf_send_signal() primitive to send a kill signal to the process from the kernel space itself. Using a Tetragon policy, one can specify a rule that says: if a specific system call is called with certain parameters, then send a kill signal to that process. As an example, if the user does not want any process to access a specific file, one can set a rule saying that if file_open is executed with the corresponding file path, then a kill signal is sent to the process. The important point to note here is that both the detection of the malicious event or system call and the sending of the kill signal happen in the kernel space itself. However, this still falls under post-attack mitigation, and the attacker could still exploit the model, as detailed by Grsecurity in an article. The team at Elastic has also pointed out similar deficiencies of using bpf_send_signal() in a blog post.

With KubeArmor, the process itself is not allowed to be executed in the first place thus not allowing the attacker to execute their code in the target environment. KubeArmor achieves this using the sound security fundamentals of Linux Security Modules (LSMs).

In the above example, we see that a malicious process is getting executed. Falco leverages eBPF system call kprobes to identify the malicious process and as a response, it can send a kill signal from the userspace.

In the case of Tetragon, the user can specify a rule stating that if a process execve syscall event is seen with the malicious process name, then a kill signal is sent using bpf_send_signal() from the kernel space itself.

KubeArmor leverages Linux Security Modules to apply policy checks before the process is executed. When the process name matches the rule, the policy check returns an EPERM error, so the execve fails with “Permission Denied”. Thus, in this case, the process is never executed in the first place.
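The difference between the two models can be illustrated with a toy Python simulation. A "process" here is just a list of named actions and nothing is actually executed. The `DETECTION_DELAY` value is an arbitrary assumption used only to show that a window exists; it is not a measurement of Falco, Tetragon, or any other tool.

```python
# Toy simulation contrasting detect-and-respond with inline enforcement.
# A "process" is modelled as a list of actions; nothing is really executed.

import errno

ALLOWED = {"/usr/sbin/nginx"}   # allow-list; everything else is denied
DETECTION_DELAY = 2             # actions that run before the kill lands (assumed)


def detect_and_respond(path, actions):
    """Let the process start, then kill it once the detector reacts."""
    done = []
    for step, action in enumerate(actions):
        if path not in ALLOWED and step >= DETECTION_DELAY:
            break               # kill signal takes effect here
        done.append(action)
    return done


def inline_enforce(path, actions):
    """Check the allow-list at exec time; a denied process never starts."""
    if path not in ALLOWED:
        raise PermissionError(errno.EPERM, "exec denied by policy", path)
    return list(actions)


payload = ["read secrets", "encrypt files", "delete backups"]

print(detect_and_respond("/tmp/ransom", payload))
try:
    inline_enforce("/tmp/ransom", payload)
except PermissionError as err:
    print("denied:", err.filename)
```

In the detect-and-respond branch some actions complete before the response arrives, whereas in the inline branch the denied binary performs none of them.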

Real-world use-cases

Securing Secrets managers

Inline mitigation as provided by KubeArmor is important when dealing with ransomware attacks. A post-exploit mitigation strategy won’t work since by the time a remedial response is taken, the ransomware might have already encrypted the assets or even worse deleted the assets. Thus protecting against ransomware attacks requires inline mitigation that ensures that the attacker is never allowed to execute their processes in the target environment and that sensitive asset accesses are never allowed.

The detailed case study for HashiCorp Vault hardening and protecting it against ransomware attacks can be found here (YouTube video).

Protecting Jupyter Notebooks

Organizations leverage Jupyter notebooks to provide their users with programmatic access to their sensitive data assets. While this is a great model, attackers or unethical users might take undue advantage of it. The very nature of a Jupyter notebook is to allow users to do remote command/code execution. KubeArmor provides a way to put appropriate guardrails in place that prevent users from executing unknown/unauthorized binaries in the notebooks. With KubeArmor, administrators can lock down notebook execution to allow only controlled access to sensitive data and controlled process execution.

The sample threat model shows multiple attack vectors for a Jupyter Notebook application deployed on k8s. The sample KubeArmor policy shows simple rules that can protect the target Jupyter Notebook environment and allow only authorized access. KubeArmor’s inline mitigation strategy ensures sound security principles are used to enforce the security rules. The detailed demonstration and presentation can be found here.

Edge/IoT, 5G ORAN security use-cases

K8s became a popular workload orchestration engine not only in cloud-native environments but also in edge and 5G scenarios. The open-source nature of KubeArmor led to its adoption by different users in interesting edge/5G scenarios. The role of k8s in fulfilling the disaggregated nature of 5G deployments such as ORAN turned out to be crucial. While disaggregation and democratization of 5G RAN are great, the primary tradeoffs are security and performance. The ability to do systems hardening using the sound security principles provided by LSMs, with minimal performance impact (due to the in-kernel nature of LSMs), and to do it seamlessly across multiple platforms (x86, ARM) turned out to be conducive to the adoption of KubeArmor in Edge and 5G deployments.

Phil Porras (Cofounder, Chief Scientist, AccuKnox) spotted this trend early on, and with his extensive background in security and telecom networks, he steered the developments in 5G/Edge verticals. Spearheaded by Phil, AccuKnox (in collaboration with SRI and Ohio State University) got selected for the NSF accelerator program that furthered the goal of 5G security.

KubeArmor was natively integrated with edge orchestration platforms such as Open Horizon, and Intel Smart Edge, and became the de facto security engine for Open Horizon.

How is AccuKnox leveraging KubeArmor?

KubeArmor is a robust and comprehensive policy engine providing not only observability and container forensics but policy enforcement as well. Policy enforcement requires a set of policies to be discovered, which in turn requires handling telemetry in an efficient and scalable way. AccuKnox has implemented tooling on top of KubeArmor that handles telemetry efficiently and provides a default set of policies custom-made for your workloads. Furthermore, policies based on compliance frameworks such as CIS, STIGs, NIST, and MITRE are auto-recommended as part of the enterprise solution.

Thus, at the core, our vision is that KubeArmor will provide policy enforcement and detailed container telemetry, and the AccuKnox enterprise solution will simplify the user journey towards identifying least-permissive Zero Trust policies and satisfying compliance frameworks out of the box.

AccuKnox has also added support for ECS, Fargate, and other advanced deployment models that are not available in core KubeArmor itself. Furthermore, deployments with a mix of Virtual Machine based workloads and k8s-orchestrated workloads can be managed much more efficiently using the AccuKnox Enterprise solution.

KubeArmor support for various platforms

KubeArmor has been tested on the most widely used platforms such as:

  1. EKS (Amazon Linux and Bottlerocket)
  2. GKE (Container optimized OS and Ubuntu)
  3. Azure (Mariner and Ubuntu)
  4. OpenShift (RHEL)
  5. IBM k8s service
  6. Oracle Kubernetes service
  7. … and many more … Please check the support matrix.

Apart from supporting Kubernetes-based deployments, KubeArmor also supports non-orchestrated but containerized deployments and pure Virtual Machine or Bare Metal based deployments as well. A Linux Foundation project called Open Horizon natively integrates with KubeArmor to provide security for edge deployments both in k8s orchestrated mode and pure-containerized mode.

Summary, Key Takeaways

Most of the instrumentation that exists around runtime security today focuses on getting runtime visibility and observability using eBPF. There are a few engines that provide enforcement in the form of a detect-and-respond model. KubeArmor is differentiated in that it provides inline mitigation in its true sense by leveraging the appropriate Linux kernel primitives. KubeArmor aims to simplify runtime security enforcement so that it operates out of the box on modern workloads (such as k8s), and, by using sound security fundamentals, to ensure that it does not provide a false sense of security. The CNCF-endorsed open-source model has helped KubeArmor gain a foothold on various platforms, and community effort has helped its adoption and extension. We are very excited about the leverage that eBPF provides in our goal to deliver Zero Trust security across Cloud (Public, Private, Air-gapped), Edge/IoT, and 5G areas.
