
Multi-Region Disaster Recovery with HAQM EKS and HAQM EFS for Stateful workloads

Introduction

HAQM Elastic File System (HAQM EFS) is a managed storage service that provides shared access to data for Kubernetes Pods running on compute nodes across different Availability Zones (AZs) managed by HAQM Elastic Kubernetes Service (HAQM EKS). HAQM EFS supports native replication of data across AWS Regions. This feature helps you design a multi-Region disaster recovery (DR) solution for mission-critical workloads on your EKS clusters.

The Container Storage Interface (CSI) is a standard that enables you to expose various storage systems to your Pods, which allows you to run stateful workloads on your Kubernetes clusters. CSI accomplishes this by mounting Persistent Volumes (PVs) to the Pods while keeping the lifecycle of the Pods completely independent from the lifecycle of the PVs. The HAQM EFS CSI Driver configures Kubernetes PVs as access points in the EFS file system, and developers use Persistent Volume Claims (PVCs) to mount the persistent volumes to the Pods.

In this post we discuss how to achieve business continuity in AWS by using HAQM EFS and HAQM EKS across AWS Regions. The solution we propose corresponds to the Pilot light and Warm standby strategies defined in the Disaster Recovery of workloads on AWS whitepaper.

Challenges

The HAQM EFS CSI Driver uses a unique access point for each PV in a Kubernetes cluster, so a File System Access Point (FSAP) ID must be specified for each volume.

  • The FSAP can be manually defined for each PV using Static Provisioning (a sketch of such a PV follows this list). However, this can become time-consuming for developers or storage administrators, as each access point must be created in HAQM EFS before creating a PVC in Kubernetes.
  • Dynamic Provisioning enables developers to create PVCs without the need to have a provisioned access point in advance as it creates access points on-demand using the file system ID of the EFS file system.
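
To illustrate the first option, here is a minimal sketch of a statically provisioned PV; the file system ID, access point ID, and the efs-sc-static class name are placeholders rather than values from the sample repository. The Region-specific FSAP ID is hard-coded into the volumeHandle, which is what makes static provisioning tedious to maintain across Regions:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: efs-pv
spec:
  capacity:
    storage: 5Gi # nominal value; HAQM EFS does not enforce capacity
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc-static # illustrative class name for static provisioning
  csi:
    driver: efs.csi.aws.com
    # volumeHandle format: [FileSystemId]::[AccessPointId]
    volumeHandle: fs-92107410::fsap-0123456789abcdef0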

Although dynamic provisioning provides a much better developer experience, there is a challenge when using HAQM EFS replication: replication copies all of the data in the file system, but not the FSAPs. This limits you to static provisioning in each EKS cluster and results in complex DR runbooks, because whenever you want to fail back to the primary Region, you need to manually reconfigure the unique FSAP IDs in that Region's PVs. This is hard to maintain and error-prone. Therefore, we need a solution that is abstracted from the Kubernetes layer but still provides shared access to the same dataset from any EKS cluster.

Solution overview

The solution uses two AWS Regions, each with an EKS cluster and an EFS file system, and we use the EFS CSI driver in each EKS cluster. For simplicity, we kept HAQM Route 53 and the routing policies used to route client requests in this multi-Region architecture out of the scope of this post.

Figure 1: Solution overview

As shown in the preceding figure, we use two AZs per AWS Region to provide further resiliency in the architecture. We use HAQM EFS replication to replicate data natively from Region 1 (the source) to Region 2 (the destination). The destination file system remains read-only while replication is configured. Each EKS worker node accesses the EFS file system within the same Region through the AZ-specific HAQM EFS mount target.

We achieve the abstraction between HAQM EKS and HAQM EFS layers by configuring a Kubernetes StorageClass object in each EKS cluster. You must specify the EFS file system ID of the respective Region in the StorageClass object.

You may already be thinking "So what is new? This is how you integrate HAQM EFS and HAQM EKS using the EFS CSI Driver?" When using the StorageClass with HAQM EFS replication, there is a new parameter called subPathPattern, introduced in version 1.7.0 of the EFS CSI Driver. It enables you to provide shared access to the same dataset in two different EFS file systems, even though each file system has a distinct FSAP ID for that dataset. Let's look at how it works.

In the StorageClass object manifest, you configure subPathPattern, as shown in the following, using the PVC name and PVC namespace variables. The pattern can be made up of fixed strings and a limited set of variables. The following pattern enables you to use the exact same Kubernetes Deployment, Pod, and PVC manifests for both the primary and DR Regions. You do not need to embed AWS Region-specific configuration parameters, such as the EFS file system ID and/or FSAP ID, into your workload manifests. All of that is abstracted by the EFS CSI Driver, which is really cool!

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-92107410
  directoryPerms: "700"
  gidRangeStart: "1000" # optional
  gidRangeEnd: "2000" # optional
  basePath: "/dynamic_provisioning" # optional
  subPathPattern: "${.PVC.namespace}/${.PVC.name}" # optional
  ensureUniqueDirectory: "false" # optional
  reuseAccessPoint: "false" # optional

There is one more parameter that you need to define in conjunction with subPathPattern: ensureUniqueDirectory. By setting this parameter to false, you make sure that the HAQM EFS CSI Driver points the PV in each AWS Region to the same directory in the EFS file system.

Next, we deploy our workloads (essentially Pods) and their PVCs, as shown in the following example manifests.
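
The following is a minimal sketch of such a workload; the my-app namespace, the app-data claim name, and the busybox image are illustrative placeholders rather than names from the sample repository. With the subPathPattern shown earlier, this PVC maps to the /dynamic_provisioning/my-app/app-data directory in the EFS file system of whichever Region the manifests are applied in, so the same manifests work unchanged in both clusters:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data      # placeholder name
  namespace: my-app   # placeholder namespace
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi # nominal value; HAQM EFS does not enforce capacity
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: public.ecr.aws/docker/library/busybox:stable
          # Writes timestamps to the shared EFS-backed volume
          command: ["sh", "-c", "while true; do date >> /data/out.txt; sleep 5; done"]
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: app-data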

It is worth mentioning that using GitOps for this type of architecture offers you the ability to manage the state of multiple Kubernetes clusters and EFS file systems using DevOps best practices, such as version control, immutable artifacts, and automation.

Failover and failback

In the event of a primary AWS Region failure, you must first convert the EFS file system in the DR Region (the destination) from read-only to writable. You achieve this by simply deleting the replication configuration of the EFS file system.

We recently introduced failback capability for HAQM EFS replication. When you decide to fail back to the primary Region, you first replicate the recent data from the DR Region back to the primary Region's EFS file system. Once that replication completes, you delete the replication configuration on the EFS file system in the primary Region, essentially converting it from read-only to writable. Finally, you configure replication again so that the primary Region's file system becomes the source of the HAQM EFS replication.

Refer to the File system failover and failback section in the HAQM EFS user guide for more information.

Code sample

We have created a GitHub repository where you can deploy the solution in this post. We walk you through the implementation steps and guide you on how to perform the failover and failback operations. The code sample is for demonstration purposes only and should not be used in production environments. Refer to the HAQM EKS Best Practices Guides and Encryption Best Practices for HAQM EFS to learn how to run production workloads using HAQM EKS and HAQM EFS.

Considerations

  • HAQM EFS replication maintains an RPO of 15 minutes for most file systems. You need to factor this in when designing your application for any type of transaction or state recovery. For more information on RTO and RPO, read this AWS Storage post and the Replicating file systems section in the HAQM EFS User Guide.
  • You can specify a user ID (uid) and group ID (gid) in the StorageClass to enforce a user identity on the EFS access point. If you don't, a value from the gidRange in the StorageClass is assigned. If you do not specify a gidRange either, the driver selects a value and assigns it as both the uid and gid. This is explained further in the sample GitHub repository (see the StorageClass sketch after this list).
  • When you deploy a workload that uses a new PVC in the primary Region, wait for the initial HAQM EFS replication of that dataset to complete before deploying the same workload in the secondary Region.
  • Deleting the replication configuration takes several minutes. Keep this in mind when planning your operations and your target RTO.
  • Each PVC consumes an HAQM EFS access point. Check the current HAQM EFS quotas and limits before making design decisions.
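
As an example of the second point, a StorageClass that enforces a fixed POSIX identity could look like the following sketch; the class name and the uid/gid values are illustrative, and the file system ID is the same placeholder used earlier:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc-fixed-identity # illustrative name
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-92107410
  directoryPerms: "700"
  uid: "1001" # POSIX user ID enforced on the access point (example value)
  gid: "1001" # POSIX group ID enforced on the access point (example value)
  subPathPattern: "${.PVC.namespace}/${.PVC.name}"
  ensureUniqueDirectory: "false"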

Conclusion

In this post, we showed you how to use HAQM EFS replication across AWS Regions for your stateful workloads running on EKS clusters, and how to achieve disaster recovery in that kind of architecture. Together, HAQM EFS replication and the HAQM EFS CSI Driver provide a simple way to replicate persistent volumes for Kubernetes across AWS Regions, enabling stateful workloads to both fail over and fail back.