Integration between Cilium and SPIFFE – Part 1
This is the first part of a series of blog posts about the integration between
the Cilium and SPIFFE projects. This part explains the current Cilium identity
model and how it could be extended to work with SPIFFE, which provides a
universal identity control plane for distributed systems. The next parts
explain the integration in more detail, along with the real-world challenges
we want to solve.
Introduction
Cilium is an open-source project that provides networking, security, and
observability for cloud-native environments such as Kubernetes clusters and
other container orchestration platforms [1]. Cilium uses eBPF, a Linux kernel
technology that allows programs (called eBPF programs) to be dynamically
loaded and safely executed inside the Linux kernel. Cilium operates as a CNI
(Container Network Interface) plugin running on each node of the cluster.
Cilium identity model
When a pod or container is created, Cilium generates an endpoint, which
logically represents that pod or container. Cilium derives an identity from
the k8s labels associated with the endpoint. An identity is a unique number
that maps to a set of k8s labels. From then on, this numeric identity is used
by the eBPF programs (at the Linux kernel level) to enforce L3/L4 policy on a
per-packet basis. The identity mechanism improves scalability and performance
compared with policy enforcement based on network addresses, as done with
iptables.
The identity number is shared and synchronized across the cluster using a
key-value store (KVStore), so any node of the cluster can use it for policy
enforcement. Figure 1 shows an example of a table that maps a set of k8s
labels to the respective identity number. This map is shared across all three
nodes (A, B, and C) of the cluster, where each node runs an instance of
Cilium.
With this identity-based approach, network security no longer depends on
network addresses, which improves flexibility and scalability, another big
advantage in cloud environments. When the number of pods or nodes increases,
the number of rules used for policy enforcement does not, because pods are
grouped by their labels (and consequently by their identity number). In the
example of Figure 1, if we add a new pod with the same set of labels
(role=frontend), Cilium reuses the identity (number 10) that was previously
generated for another pod with the same labels.
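
The label-to-identity mapping described above can be sketched in a few lines
of Python. This is only an illustration of the idea, not Cilium's actual
allocator: the class name IdentityAllocator, the method identity_for, and the
starting number 10 (borrowed from the Figure 1 example) are assumptions made
for this sketch.

```python
from typing import Dict, FrozenSet, Tuple

# A label set is stored as a frozenset so that it can be used as a dict key
# and so that label order does not matter.
Labels = FrozenSet[Tuple[str, str]]


class IdentityAllocator:
    """Toy allocator (hypothetical): one numeric identity per label set."""

    def __init__(self, first_id: int = 10) -> None:
        self._next_id = first_id
        self._by_labels: Dict[Labels, int] = {}

    def identity_for(self, labels: Dict[str, str]) -> int:
        key = frozenset(labels.items())
        if key not in self._by_labels:
            # First time this label set is seen: allocate a new number.
            self._by_labels[key] = self._next_id
            self._next_id += 1
        return self._by_labels[key]


allocator = IdentityAllocator()
frontend_a = allocator.identity_for({"role": "frontend"})
frontend_b = allocator.identity_for({"role": "frontend"})  # a second pod
backend = allocator.identity_for({"role": "backend"})
```

Both frontend pods end up with the same number, so a policy written against
that identity covers any number of pods that carry the same labels.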
One current drawback of Cilium’s identity mechanism is that it is limited to
Kubernetes clusters and other container orchestration platforms, and this is
where SPIFFE comes into the picture. SPIFFE is a universal identity control
plane with strong identity attestation procedures, which makes it possible to
support identity and trust across different platforms and cloud providers [3].
The next section explains the current limitations in Cilium and how SPIFFE can
address them.
Extend Cilium identity model using SPIFFE
SPIFFE (Secure Production Identity Framework for Everyone) is a set of
specifications that cover how a workload should retrieve and use its identity
[4]. If your security identity model is based on SPIFFE, you can establish
trust between workloads and services running on different platforms, cloud
providers, or even edge devices, as long as they are also based on SPIFFE.
SPIRE (SPIFFE Runtime Environment) is an implementation of the SPIFFE APIs
that performs platform and workload attestation in order to securely issue
SVIDs (SPIFFE Verifiable Identity Documents) to workloads and verify the SVIDs
of other workloads, based on a predefined set of conditions [5].
In Figure 2, the basic components of identity are divided into four groups and
related to both projects. This division helps us understand how each project
approaches each component of identity and what the advantages of this
integration are.
- Identity Attributes & Attestation: Every identity system depends on a set
  of attributes and on attestation of those attributes. The attestation
  procedure ensures that the listed attributes indeed belong to the workload
  that claims them. Cilium does not employ explicit workload attestation
  procedures, and only k8s labels are used to calculate the endpoint identity;
  the Kubernetes control plane takes care of label management and Cilium just
  uses the result. SPIFFE, on the other hand, provides a strong mechanism for
  identity attestation and can use the Kubernetes plugin [6] to attest k8s
  labels, or use other information such as location (node), container/pod
  names, and container/pod images to compose the endpoint identity. Using the
  SPIFFE identity mechanism, Cilium can derive the endpoint identity in a more
  secure way, and the attestation procedures can be based on an extensive
  attribute list [7].
- Identity Mapping: The set of attributes may be mapped to an intermediate
  representation which essentially serves as “the ID”. Cilium maps a set of
  k8s labels to a numeric identity, whereas SPIFFE maps the workload
  attributes to a SPIFFE ID, which takes the form of a URI (Uniform Resource
  Identifier). An example of a SPIFFE ID is
  “spiffe://acme.com/billing/payments”. The identity document is called an
  SVID; in its X.509 form it is essentially a signed X.509 certificate with a
  few mandatory fields, such as a SAN (Subject Alternative Name) that carries
  the SPIFFE ID.
- Identity Carrier: The application network connection needs to communicate
  the ID to the remote peer. Cilium uses IPCache, a table that maps pod IP
  addresses to identities. In this way, a pod knows the identity of a remote
  peer. In the case of SPIFFE, an mTLS handshake carries the identity, and by
  the end of the handshake both peers know each other’s SPIFFE ID, which is
  carried as part of the certificate. Using the SPIFFE approach, it is
  possible to carry an identity beyond Cilium’s boundaries, for example
  between a k8s and a non-k8s workload.
- Identity Derivatives: The identity attestation procedure might eventually
  result in the derivation of other credentials (such as certificates or
  tokens) which could be used by other applications. The identity composed by
  Cilium can only be used by Cilium itself, in this case by CNPs (Cilium
  Network Policies) for authorization, and there is no identity derivation.
  SPIRE can derive X.509 certificates that can be used for mTLS or for
  IPSec/WireGuard-based authentication procedures. It can also issue JWT
  tokens that can be used for policy enforcement based on micro-segmentation
  [8].
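
To make the mapping and carrier items above more concrete, the following
Python sketch contrasts the two lookups. It is a simplified illustration under
stated assumptions: the helper parse_spiffe_id, the SpiffeID tuple, and the
IP addresses and numbers in the ipcache dictionary are all hypothetical, and
the validation covers only the basic URI shape rather than every rule of the
SPIFFE specification.

```python
from typing import Dict, NamedTuple
from urllib.parse import urlsplit


class SpiffeID(NamedTuple):
    trust_domain: str
    path: str


def parse_spiffe_id(uri: str) -> SpiffeID:
    """Split a SPIFFE ID URI into its trust domain and workload path.

    Simplified check only; the SPIFFE specification defines further rules.
    """
    parts = urlsplit(uri)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {uri!r}")
    if parts.query or parts.fragment:
        raise ValueError(f"SPIFFE ID must not have a query or fragment: {uri!r}")
    return SpiffeID(trust_domain=parts.netloc, path=parts.path)


# Cilium-style carrier: a table from pod IP address to numeric identity.
# The addresses and numbers are made-up sample data.
ipcache: Dict[str, int] = {"10.0.1.5": 10, "10.0.2.7": 11}
peer_identity = ipcache.get("10.0.1.5")

# SPIFFE-style carrier: the ID would be read from the URI SAN of the peer's
# certificate after the mTLS handshake; here it is simply given as a string.
parsed = parse_spiffe_id("spiffe://acme.com/billing/payments")
```

The first lookup only works where the IP-to-identity table is available, while
the SPIFFE ID travels with the certificate itself.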
Benefits SPIFFE could provide to Cilium:
- Strong attestation and authentication procedures for identity, with strong
  cryptographic protection for the identity value.
- A generic identity solution that extends to non-k8s workloads and to
  edge/IoT/endpoint scenarios as well.
- The ability to federate identity across multiple service providers. For
  example, if one service provider uses an Istio-based service mesh and
  another uses Cilium+SPIFFE, it will be possible to federate the identity.
- The ability to use the SVIDs for other purposes, such as transparent
  encryption and WireGuard/IPSec tunnels. This solves the problem of
  certificate management consistently across all services and use cases.
- A single identity across all policy enforcement engines, such as network,
  system, and data.
- The ability to extend the identity solution with a hardware-based
  attestation service using confidential computing (enclaves, TPMs).
Considerations with this integration:
- The ability to fall back to the classic Cilium identity solution, which maps
  a set of k8s labels to an identity value that is used directly for
  authorization.
- No impact (performance or functional) on Cilium’s data-path handling of
  identities.
- The ability to use both per-packet identity and the mTLS handshake for
  authorization.
A high-level overview of the integration
Figure 3 shows a high-level overview of a Kubernetes environment with Cilium
and SPIRE deployed according to the integration. In step 1, SPIRE is deployed
in the cluster, and in step 2 two registration entries are created for two
different pods, podfoo and poddefault. Once the registration entries are
created in spire-server, they are cached by all SPIRE agents. Steps 1 and 2
are common to any SPIRE deployment in a cluster. The integration comes into
the picture in step 3. When both pods are deployed in the cluster, Cilium
creates an endpoint to represent each pod and generates a numeric identity
based on the pod’s labels. In addition, Cilium connects to SPIRE (through a
Delegated Identity API, which was also created as part of this integration)
and, on behalf of the pods, asks SPIRE to attest the pods’ attributes (step
3.1).
In our example, SPIRE already holds a registration entry for each pod. The
selectors of each registration entry match the attributes used to create
podfoo and poddefault, so an X.509 SVID is returned for each pod (step 3.2).
Cilium uses it to create a label on each pod containing the SPIFFE ID URI. As
soon as this label is created, Cilium includes it when computing the numeric
identity. Finally, the label can be used by a CNP (step 4) for L3/L4 policy
enforcement. It is also possible to do L7 policy enforcement using the
returned X.509 SVID together with the cilium-envoy proxy, and to use the SVID
to upgrade a non-secure connection to mTLS. This example shows a connection
upgrade between two workloads running in Kubernetes, but the certificate can
also be used to upgrade a connection between a k8s and a non-k8s workload.
To sum up, an attestation process happens on behalf of each pod (performed by
Cilium) and the pod receives a new label (SPIFFE ID) based on its own
attributes. Then this new label can be used by a CNP to do L3/L4/L7 policy
enforcement.
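
The flow summarized above can be sketched as a small Python simulation. It is
a rough model under several assumptions, not the code of the integration: the
names RegistrationEntry, attest, and labels_with_spiffe_id, the label key
"spiffe-id", the example.org trust domain, and the sample selectors are all
hypothetical. The sketch matches an entry when all of its selectors are
present among the pod’s attested attributes, and it falls back to the plain
k8s labels when no entry matches.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional


class RegistrationEntry(NamedTuple):
    spiffe_id: str
    selectors: FrozenSet[str]


def attest(pod_selectors: FrozenSet[str],
           entries: List[RegistrationEntry]) -> Optional[str]:
    """Return the SPIFFE ID of the first entry whose selectors all match."""
    for entry in entries:
        if entry.selectors <= pod_selectors:  # subset test
            return entry.spiffe_id
    return None


def labels_with_spiffe_id(labels: Dict[str, str],
                          spiffe_id: Optional[str]) -> Dict[str, str]:
    """Add the SPIFFE ID as an extra label; keep labels as-is on fallback."""
    if spiffe_id is None:
        return dict(labels)  # classic label-only identity
    return {**labels, "spiffe-id": spiffe_id}  # hypothetical label key


# Step 2: registration entries (sample data for this sketch).
entries = [
    RegistrationEntry("spiffe://example.org/podfoo",
                      frozenset({"k8s:ns:default", "k8s:pod-name:podfoo"})),
]

# Step 3: attributes discovered for two pods; only podfoo has an entry.
podfoo_attrs = frozenset({"k8s:ns:default", "k8s:pod-name:podfoo",
                          "k8s:sa:default"})
other_attrs = frozenset({"k8s:ns:default", "k8s:pod-name:other"})

podfoo_labels = labels_with_spiffe_id({"app": "foo"},
                                      attest(podfoo_attrs, entries))
other_labels = labels_with_spiffe_id({"app": "other"},
                                     attest(other_attrs, entries))
```

The extended label set of podfoo would then feed the usual label-to-identity
mapping, so policies can select on the SPIFFE ID label like on any other
label.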
In part 2 of this series, we will explain in detail how we built this
integration and how Cilium and SPIRE were modified to make it possible. We
will also show an example of L3/L4/L7 policy enforcement using the SPIFFE ID,
together with an example of upgrading a non-secure connection to mTLS.
References:
[2] https://docs.cilium.io/en/v1.10/concepts/terminology/
[3] https://www.youtube.com/watch?v=0LSaNrOabH4
[4] https://spiffe.io/docs/latest/spiffe-about/overview/
[5] https://spiffe.io/docs/latest/spire-about/spire-concepts/
[6] https://github.com/spiffe/spire/blob/main/doc/plugin_agent_workloadattestor_k8s.md
[7] https://spiffe.io/docs/latest/deploying/registering/#2-defining-the-spiffe-id-of-the-workload
[8] https://www.accuknox.com/blog/identity-based-micro-segmentation-using-jwt-tokens