Deploy Amazon EKS advanced data protection using the NetApp Trident AWS Marketplace EKS add-on and Amazon FSx for NetApp ONTAP

NetApp Trident for Amazon FSx for NetApp ONTAP is available as an AWS Marketplace add-on for Amazon EKS. The add-on simplifies subscription and deployment by automating the initial installation and configuration of NetApp Trident on Amazon EKS clusters.

Used together with FSx for ONTAP, NetApp Trident provides not only basic Container Storage Interface (CSI) storage but also advanced data management features, such as PersistentVolumeClaim (PVC) mirroring between Amazon EKS clusters for disaster recovery (DR), instant snapshots for data protection, and in-place snapshot restore. Because these features are exposed as Kubernetes custom resources, you can manage them with standard Amazon EKS tools and incorporate them into GitOps processes as part of application lifecycle management.

This post shows you how to deploy Trident using the AWS Marketplace add-on, set up data replication between two Amazon EKS clusters with two FSx for ONTAP file systems, and test failover.

Prerequisites

The following prerequisites are required to set up this demo environment:

  1. Create two EKS clusters, in the same or different AWS Regions, to act as the primary and DR clusters.
  2. Enable an AWS Identity and Access Management (IAM) OpenID Connect (OIDC) provider for both Amazon EKS clusters so that you can configure IAM roles for service accounts.
  3. Create two FSx for ONTAP file systems, one primary and one DR.
  4. Create NetApp SnapMirror peering between the primary and DR FSx for ONTAP file systems and their storage virtual machines (SVMs).
  5. Install eksctl and configure your local host to access both EKS clusters.
  6. Set environment variables (as shown in figure 1) with the primary and DR EKS cluster names:

Figure 1: Shell script demonstrating AWS EKS high availability multi-cluster setup

  7. The following AWS Marketplace permissions are required:

"aws-marketplace:ViewSubscriptions",
"aws-marketplace:Subscribe",
"aws-marketplace:Unsubscribe"

  8. Use the following command (as shown in figure 2) to create a separate secret in AWS Secrets Manager for each FSx for ONTAP SVM:
AWS secret creation script defining Trident storage credentials with JSON-formatted authentication parameters

Figure 2: AWS CLI command creates a secret in AWS Secrets Manager
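
Because figures 1 and 2 are not reproduced as text, the following is a minimal sketch of what the environment variables and the secret creation could look like. The cluster names, Regions, and secret names are placeholders, and vsadmin is assumed to be the SVM administrative user; substitute your own values. The AWS CLI call is wrapped in a function so that nothing is created until you invoke it.

```shell
#!/usr/bin/env bash
# Placeholder cluster names and Regions (assumptions, not values from this post).
export PRIMARY_CLUSTER="primary-eks-cluster"
export DR_CLUSTER="dr-eks-cluster"
export PRIMARY_REGION="us-east-1"
export DR_REGION="us-west-2"

# Create one Secrets Manager secret holding the credentials of one SVM.
# Usage: create_svm_secret <secret-name> <region> <svm-password>
create_svm_secret() {
  local name="$1" region="$2" password="$3"
  aws secretsmanager create-secret \
    --name "$name" \
    --region "$region" \
    --description "Trident credentials for an FSx for ONTAP SVM" \
    --secret-string "{\"username\":\"vsadmin\",\"password\":\"${password}\"}"
}
```

For example, create_svm_secret trident-secret-primary "$PRIMARY_REGION" '<password>' creates the secret for the primary SVM; repeat with a different name and Region for the DR SVM. Note the ARN returned by each call, because the Trident backend configuration references it later.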

Solution overview

The solution consists of four high-level steps (as shown in figure 3):

AWS high availability architecture depicting EKS clusters, Trident CSI, and FSx storage replication between primary and DR sites

Figure 3: Amazon EKS PVC replication using NetApp Trident and Amazon FSx for NetApp ONTAP

  1. Install the Trident EKS add-on on both the primary and DR Amazon EKS clusters
  2. Configure Trident on both EKS clusters
  3. Create source PVC and Trident Mirror Relationship (TMR) replication relationship on the primary EKS cluster
  4. Create corresponding TMR and destination PVC on the DR EKS cluster

Solution walkthrough: Deploy Amazon EKS advanced data protection using the NetApp Trident AWS Marketplace EKS add-on and Amazon FSx for NetApp ONTAP

The following steps walk you through the process to deploy Trident and set up data replication using a combination of the AWS CLI and eksctl.

Install the Trident EKS add-on on both the primary and DR Amazon EKS clusters.

  1. Create IAM policy

Use the following template (as shown in figure 4) to create a policy.json file, which defines the required IAM policy:

AWS IAM policy document specifying FSx management actions and Secrets Manager retrieval permissions with ARN scoping

Figure 4: policy.json contains AWS IAM permissions for FSx operations and Secrets Manager access

Run the following command (as shown in figure 5) to create the policy:

AWS IAM policy creation command defining NetApp Trident access to FSxN and Secrets Manager services

Figure 5: AWS CLI command creates an AWS IAM policy for Trident CSI driver integration
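
Since figures 4 and 5 are images, here is a hedged sketch of the policy document and the command that registers it. The action list follows the pattern in the Trident documentation for FSx for ONTAP and should be verified against the Trident version you deploy. The policy name AmazonFSxNCSIDriverPolicy and the placeholders in the secret ARN are assumptions.

```shell
#!/usr/bin/env bash
# Write the IAM policy document. Replace the <...> placeholders in the
# Secrets Manager ARN before creating the policy.
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "fsx:DescribeFileSystems",
        "fsx:DescribeVolumes",
        "fsx:CreateVolume",
        "fsx:RestoreVolumeFromSnapshot",
        "fsx:DescribeStorageVirtualMachines",
        "fsx:UntagResource",
        "fsx:UpdateVolume",
        "fsx:TagResource",
        "fsx:DeleteVolume"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "secretsmanager:GetSecretValue",
      "Resource": "arn:aws:secretsmanager:<region>:<account_ID>:secret:<secret_name>*"
    }
  ]
}
EOF

# Register the policy (hypothetical policy name). Call this once per account.
create_trident_policy() {
  aws iam create-policy \
    --policy-name AmazonFSxNCSIDriverPolicy \
    --policy-document file://policy.json \
    --description "Lets Trident manage FSx for ONTAP volumes and read SVM credentials"
}
```

If the primary and DR secrets live in different Regions, either add one secretsmanager statement per secret ARN or widen the Resource pattern accordingly.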

  2. Create an IAM role

Run the following command (as shown in figure 6) to create an IAM role for the service account with a custom role name. In this example, the role name is <AmazonEKS_FSxN_CSI_DriverRole>:

eksctl create command establishing IAM service account with Amazon FSx permissions in EKS cluster namespace

Figure 6: Command creates an AWS IAM service account using eksctl
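
The eksctl call in figure 6 could resemble the following sketch. The service account name trident-controller and the namespace trident are assumptions based on Trident's default installation; the cluster name, policy ARN, and role name are supplied as arguments.

```shell
#!/usr/bin/env bash
# Create the IAM role that the Trident controller service account assumes.
# Usage: create_trident_irsa_role <cluster-name> <policy-arn> <role-name>
create_trident_irsa_role() {
  local cluster="$1" policy_arn="$2" role_name="$3"
  eksctl create iamserviceaccount \
    --name trident-controller \
    --namespace trident \
    --cluster "$cluster" \
    --role-name "$role_name" \
    --attach-policy-arn "$policy_arn" \
    --role-only \
    --approve
}
```

The --role-only flag asks eksctl to create just the IAM role and leave the Kubernetes service account to the add-on. IAM role names are unique within an account, so if both clusters share one account you will likely need a distinct role name per cluster.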

  3. Install the Trident add-on on both EKS clusters

Create an add-on.json file (as shown in figure 7) that captures the add-on setup parameters. Set clusterName to the name of the primary or DR EKS cluster. Set serviceAccountRoleArn to the Amazon Resource Name (ARN) of the role created in the previous step. In this example, the ARN is <arn:aws:iam::<account_ID>:role/AmazonEKS_FSxN_CSI_DriverRole>. Update configurationValues with the same ARN at the end of the string value:

add-on.json configuration file linking NetApp Trident operator to EKS FSx driver role and cloud identity

Figure 7: add-on.json showing cluster settings for Trident operator add-on with service account role specifications

Run the following command (as shown in figure 8) to install Trident on both primary and DR Amazon EKS clusters:

AWS EKS add-on creation command with JSON input file reference

Figure 8: AWS CLI command to create an EKS add-on using add-on.json as configuration file

Verify the Trident add-on deployment status by checking the current version with the eksctl command (as shown in figure 9). Perform this step on both the primary and DR EKS clusters, replacing primary/DR-Cluster with the respective cluster name.

eksctl command retrieving NetApp Trident operator addon for Primary/DR cluster

Figure 9: eksctl command gets the Trident operator add-on for a specified cluster

You should expect the output shown in figure 10:

Trident operator add-on output

Figure 10: Output of the Trident operator add-on
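
As a sketch of the install and verify steps above, the following functions generate add-on.json for one cluster, create the add-on, and query its status. The add-on name netapp_trident-operator and the cloudIdentity format in configurationValues are assumptions taken from the Trident documentation; addonVersion is omitted here, so pin a version explicitly if you need a specific release.

```shell
#!/usr/bin/env bash
# Write add-on.json for one cluster.
# Usage: write_addon_json <cluster-name> <role-arn>
write_addon_json() {
  local cluster="$1" role_arn="$2"
  cat > add-on.json <<EOF
{
  "clusterName": "${cluster}",
  "addonName": "netapp_trident-operator",
  "serviceAccountRoleArn": "${role_arn}",
  "configurationValues": "{\"cloudIdentity\": \"'eks.amazonaws.com/role-arn: ${role_arn}'\"}"
}
EOF
}

# Install the add-on described by add-on.json.
install_trident_addon() {
  aws eks create-addon --cli-input-json file://add-on.json
}

# Show the add-on status and version for one cluster.
# Usage: check_trident_addon <cluster-name>
check_trident_addon() {
  eksctl get addon --name netapp_trident-operator --cluster "$1"
}
```

Run write_addon_json followed by install_trident_addon once for the primary cluster and once for the DR cluster, passing the role ARN that belongs to each.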

Configure Trident on both primary and DR EKS clusters

  1. Configure both FSx for ONTAP file systems.

Create a separate backend-config.yaml file for each cluster, using the SVM credentials (username and password) stored in AWS Secrets Manager as described in the prerequisites. Do this for both the primary and DR EKS clusters and their respective FSx for ONTAP file systems.

In the following example YAML file (as shown in figure 11), update the fsxFilesystemID value (<fs-xxxxxxxxxx>) and <region> with your file system ID and AWS Region.

TridentBackendConfig YAML defining ONTAP NAS storage with AWS credential

Figure 11: backend-config.yaml defines a Trident backend storage configuration using ONTAP driver

Run the following commands (as shown in figure 12) on both EKS clusters:

kubectl commands switching contexts and creating backend config across EKS clusters

Figure 12: kubectl commands create backend configurations using backend-config.yaml for both primary and DR clusters
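
The backend configuration and the commands that apply it might look like the sketch below. The resource name backend-fsx-ontap-nas, the backendName, and the kubectl context argument are hypothetical; the field layout (ontap-nas driver, aws.fsxFilesystemID, credentials of type awsarn) is assumed from the Trident documentation.

```shell
#!/usr/bin/env bash
# Write backend-config.yaml for one FSx for ONTAP file system.
# Usage: write_backend_config <fsx-filesystem-id> <svm-name> <secret-arn>
write_backend_config() {
  local fs_id="$1" svm="$2" secret_arn="$3"
  cat > backend-config.yaml <<EOF
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-fsx-ontap-nas
  namespace: trident
spec:
  version: 1
  storageDriverName: ontap-nas
  backendName: fsx-ontap-nas
  svm: ${svm}
  aws:
    fsxFilesystemID: ${fs_id}
  credentials:
    name: "${secret_arn}"
    type: awsarn
EOF
}

# Switch to the given kubectl context and apply the backend configuration.
# Usage: apply_backend_config <kubectl-context>
apply_backend_config() {
  kubectl config use-context "$1"
  kubectl apply -f backend-config.yaml
}
```

Generate and apply the file once per cluster, each time with that cluster's file system ID, SVM name, and secret ARN.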

  2. Configure Kubernetes StorageClass objects for both the primary and DR EKS clusters.

Using the following storageclass.yaml example (as shown in figure 13), create the YAML files and run the command from both primary and DR EKS clusters:

StorageClass YAML defining Trident CSI storage with volume expansion and reclaim policies

Figure 13: storageclass.yaml defines StorageClass using the NetApp Trident provisioner for FSx ONTAP

Run the following commands (as shown in figure 14):

kubectl commands creating StorageClass in primary and DR EKS clusters

Figure 14: kubectl commands create StorageClass configurations using storageclass.yaml
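
A StorageClass along the lines of figure 13 could be written and applied as follows. The name basic-csi matches the storage class mentioned in the figure 15 caption; the reclaimPolicy value and the context argument are assumptions, so adjust them to your retention requirements.

```shell
#!/usr/bin/env bash
# Write a StorageClass that provisions volumes through Trident's ontap-nas driver.
cat > storageclass.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: basic-csi
provisioner: csi.trident.netapp.io
parameters:
  backendType: "ontap-nas"
allowVolumeExpansion: true
reclaimPolicy: Delete
EOF

# Switch to the given kubectl context and create the StorageClass there.
# Usage: apply_storageclass <kubectl-context>
apply_storageclass() {
  kubectl config use-context "$1"
  kubectl apply -f storageclass.yaml
}
```

Using the same StorageClass name on both clusters keeps the source and destination PVC manifests symmetrical.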

Create source PVC and TMR replication relationship on the primary Amazon EKS cluster.

  1. Create the PVC.

On the primary EKS cluster, create a PVC by using the following YAML file (as shown in figure 15) as an example. You can change the PVC name and storage size to meet your requirements:

Complete Kubernetes PersistentVolumeClaim configuration showing storage specs, access modes, and CSI integration

Figure 15: pvc.yaml defines a PVC with basic configurations including storage size, ReadWriteMany access and basic-csi storage class

Run the following commands (as shown in figure 16) to create the PVC:

kubectl commands to set context, create PVC from YAML, and list PVCs

Figure 16: kubectl commands create PVC from pvc.yaml, and retrieve PVC information

Check the PVC status on the primary EKS cluster. The following is the expected output (as shown in figure 17):

PVC information from previous kubectl commands

Figure 17: PVC information
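
The source PVC step can be sketched as follows. The PVC name pvc-storage and the basic-csi storage class come from this walkthrough; the 10Gi size and the context argument are placeholders.

```shell
#!/usr/bin/env bash
# Write the source PVC manifest for the primary cluster.
cat > pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-storage
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: basic-csi
EOF

# Create the PVC on the primary cluster and list PVCs to confirm it is Bound.
# Usage: create_source_pvc <primary-kubectl-context>
create_source_pvc() {
  kubectl config use-context "$1"
  kubectl apply -f pvc.yaml
  kubectl get pvc
}
```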

  2. Create the TMR.

After the PVC is created on the primary EKS cluster, use the following example to create the mirrorsource.yaml file (as shown in figure 18) for the TMR between the primary and DR PVCs.

NetApp Trident YAML defining mirror relationship with promoted state and PVC storage mapping

Figure 18: mirrorsource.yaml defines a TridentMirrorRelationship resource with promoted state

After you have the mirrorsource.yaml file ready, run the following commands (as shown in figure 19):

kubectl commands showing Trident mirror source deployment with context switching

Figure 19: kubectl commands configure a Trident mirror relationship from mirrorsource.yaml

Check the TMR on the primary Amazon EKS cluster. The following is the expected output (as shown in figure 20):

TMR in promoted state

Figure 20: TMR in promoted state from primary EKS cluster

Get the FSx for ONTAP local volume handle for your PVC from the primary Amazon EKS cluster by running the following command (as shown in figure 21).

kubectl command extracting local volume handle from PVC using custom jsonpath format

Figure 21: kubectl command uses jsonpath to extract local volume handle

The output should look like the following example, with your FSx for ONTAP SVM name and volume ID. Document these values as you will use them in the next step.

[pvc-storage ,<fsxn_svm>:<volume_id>]
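
The source-side TMR and the volume handle lookup could be sketched as below. The TMR name pvc-storage is assumed to match the PVC name, and the jsonpath expression assumes that Trident reports localPVCName and localVolumeHandle under status.conditions[0], as shown in the Trident replication documentation; confirm the field path with kubectl get tmr pvc-storage -o yaml.

```shell
#!/usr/bin/env bash
# Write the source TMR: the primary side starts in the "promoted" state.
cat > mirrorsource.yaml <<'EOF'
apiVersion: trident.netapp.io/v1
kind: TridentMirrorRelationship
metadata:
  name: pvc-storage
spec:
  state: promoted
  volumeMappings:
    - localPVCName: pvc-storage
EOF

# Create the TMR on the primary cluster.
# Usage: create_source_tmr <primary-kubectl-context>
create_source_tmr() {
  kubectl config use-context "$1"
  kubectl apply -f mirrorsource.yaml
  kubectl get tmr
}

# Print the PVC name and local volume handle (<svm>:<volume>) of a TMR.
# Usage: get_local_volume_handle <tmr-name>
get_local_volume_handle() {
  kubectl get tmr "$1" \
    -o jsonpath='[{.status.conditions[0].localPVCName},{.status.conditions[0].localVolumeHandle}]'
}
```

The handle printed by get_local_volume_handle is the value that the DR cluster needs as remoteVolumeHandle.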

Create corresponding TMR and destination PVC on the DR Amazon EKS cluster.

  1. Create the TMR on the DR Amazon EKS cluster

Create the TMR on the DR EKS cluster using the mirrordest.yaml manifest (as shown in figure 22). Make sure you update localPVCName (such as pvc-storage) and set remoteVolumeHandle to the value recorded in the previous step, in the following format:

<fsxn_svm>:<volume_id>

NetApp Trident mirrordest.yaml configuration showing established PVC storage mapping with FSXN volume handle

Figure 22: mirrordest.yaml establishes storage mirroring relationships

Run the following commands (as shown in figure 23) to create the TMR on the DR Amazon EKS cluster:

kubectl commands create mirror destination

Figure 23: kubectl commands creating mirror destination using mirrordest.yaml

You should get the following expected output (as shown in figure 24):

TMR created successfully

Figure 24: TMR created successfully
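
A destination TMR along the lines of figure 22 might be generated like this. The function takes the remote volume handle recorded on the primary cluster; the TMR and PVC names are assumed to be pvc-storage on both sides, and optional fields such as a replication policy or schedule are left out.

```shell
#!/usr/bin/env bash
# Write the destination TMR: the DR side starts in the "established" state
# and points at the source volume through remoteVolumeHandle.
# Usage: write_mirrordest <remote-volume-handle>   e.g. "<fsxn_svm>:<volume_id>"
write_mirrordest() {
  local remote_handle="$1"
  cat > mirrordest.yaml <<EOF
apiVersion: trident.netapp.io/v1
kind: TridentMirrorRelationship
metadata:
  name: pvc-storage
spec:
  state: established
  volumeMappings:
    - localPVCName: pvc-storage
      remoteVolumeHandle: "${remote_handle}"
EOF
}

# Create the TMR on the DR cluster.
# Usage: create_dest_tmr <dr-kubectl-context>
create_dest_tmr() {
  kubectl config use-context "$1"
  kubectl apply -f mirrordest.yaml
  kubectl get tmr
}
```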

  2. Create the PVC on the DR Amazon EKS cluster and start data replication.

The final step is to create the PVC on the DR EKS cluster and start the replication. Make sure you update the TridentMirrorRelationship metadata annotation with the TMR name established in the previous example. In the following example (as shown in figure 25), it is set to pvc-storage.

pvcdest.yaml defining PVC with ReadWriteMany access, Trident mirror settings, and NAS storage class

Figure 25: pvcdest.yaml defines storage claim with ReadWriteMany access

You can now run the following commands (as shown in figure 26) to create the PVC:

kubectl commands set context, create PVC, and verify mirror status

Figure 26: kubectl commands set context, create PVC, and verify mirror status

The TMR is established after the destination PVC is created. You can check the TMR status by comparing it to the expected output (as shown in figure 27):

TMR in established state

Figure 27: TMR in established state
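
The destination PVC differs from the source PVC mainly in the annotation that binds it to the TMR. The sketch below assumes the annotation key trident.netapp.io/mirrorRelationship from the Trident documentation, the same basic-csi storage class on the DR cluster, and a placeholder size that should match the source PVC.

```shell
#!/usr/bin/env bash
# Write the destination PVC; the annotation names the TMR created on the DR cluster.
cat > pvcdest.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-storage
  annotations:
    trident.netapp.io/mirrorRelationship: pvc-storage
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: basic-csi
EOF

# Create the PVC on the DR cluster, then show the TMR so you can watch it
# move to the "established" state.
# Usage: create_dest_pvc <dr-kubectl-context>
create_dest_pvc() {
  kubectl config use-context "$1"
  kubectl apply -f pvcdest.yaml
  kubectl get tmr
}
```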

The Amazon EKS environment setup and configuration are now complete. Wait a few minutes for the initial replication to finish, after which you will have a pair of DR-ready Amazon EKS PVCs.

For the next stage, manually trigger a failover to simulate a DR scenario, for testing purposes.

How to manually trigger a failover from the DR Amazon EKS cluster

Follow these steps to simulate a DR event affecting your primary EKS cluster and move both the application and its data to the DR cluster.

In your test environment, you can stop the data replication and promote the PVC on the DR EKS cluster so that it is read-write and mountable. Using the following mirrordestdr.yaml manifest as an example (as shown in figure 28), change the TMR state from “established” to “promoted”, which turns your DR Amazon EKS cluster into the primary.

mirrordestdr.yaml defining promoted state for Trident mirror relationship in disaster recovery

Figure 28: mirrordestdr.yaml declares a TMR with promoted state

Run the following command (as shown in figure 29) to trigger the failover:

kubectl command triggers a cluster failover

Figure 29: kubectl command triggers cluster failover

Note: It’s important to validate that the state of the TMR changes to “promoted”. The status might show “promoting” for a while because the failover takes time. This is normal; wait a few minutes and check again if necessary.

To get the state of the TMR, use the following command:

kubectl get tmr

The following is the expected output (as shown in figure 30):

TMR in promoted state

Figure 30: TMR in promoted state
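
The failover and the status check described in the note can be combined into a small script. This is a sketch under the same assumptions as before: the TMR is named pvc-storage, and the current state is read from status.conditions[0].state, which should be confirmed against your Trident version.

```shell
#!/usr/bin/env bash
# Write the failover manifest: the same TMR on the DR cluster, now "promoted".
cat > mirrordestdr.yaml <<'EOF'
apiVersion: trident.netapp.io/v1
kind: TridentMirrorRelationship
metadata:
  name: pvc-storage
spec:
  state: promoted
  volumeMappings:
    - localPVCName: pvc-storage
EOF

# Apply the promoted state on the DR cluster.
# Usage: trigger_failover <dr-kubectl-context>
trigger_failover() {
  kubectl config use-context "$1"
  kubectl apply -f mirrordestdr.yaml
}

# Poll the TMR until it reports the wanted state or the attempts run out.
# Usage: wait_for_tmr_state <tmr-name> <state> [attempts, default 30]
wait_for_tmr_state() {
  local name="$1" want="$2" tries="${3:-30}" state=""
  while [ "$tries" -gt 0 ]; do
    state="$(kubectl get tmr "$name" -o jsonpath='{.status.conditions[0].state}')"
    [ "$state" = "$want" ] && return 0
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] && sleep 10
  done
  echo "TMR ${name} is '${state}', expected '${want}'" >&2
  return 1
}
```

Calling trigger_failover followed by wait_for_tmr_state pvc-storage promoted waits up to about five minutes for the transition through “promoting” to finish.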

Congratulations! You have successfully set up DR for your Amazon EKS clusters and validated it by failing over to the DR cluster.

At this point, your DR EKS cluster is in production and can host your production workload. Once your primary EKS cluster and FSx for ONTAP file system are operational again, you can perform the replication steps in the reverse direction to fail back to your original primary site.

Cleanup

To remove the test environment, run the following commands (as shown in figure 31):

kubectl commands to remove the test environment

Figure 31: kubectl commands remove the test environment
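
Figure 31 is not available as text, so the following is only a sketch of one possible cleanup order. The resource names (pvc-storage, basic-csi, backend-fsx-ontap-nas) are the hypothetical names used for illustration in this walkthrough and must match whatever you actually created.

```shell
#!/usr/bin/env bash
# Remove the demo resources from one cluster: PVC and TMR first, then the
# StorageClass and the Trident backend configuration.
# Usage: cleanup_cluster <kubectl-context>
cleanup_cluster() {
  kubectl config use-context "$1"
  kubectl delete pvc pvc-storage --ignore-not-found
  kubectl delete tmr pvc-storage --ignore-not-found
  kubectl delete storageclass basic-csi --ignore-not-found
  kubectl delete tridentbackendconfig backend-fsx-ontap-nas -n trident --ignore-not-found
}
```

Run it against the DR context and then the primary context. The EKS add-on, IAM role and policy, Secrets Manager secrets, and the FSx for ONTAP file systems are not touched by this sketch and need to be removed separately if you no longer need them.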

Conclusion

Amazon EKS with NetApp Trident and FSx for ONTAP provides robust data protection and disaster recovery capabilities for containerized workloads on AWS. The AWS Marketplace add-on simplifies deployment, allowing DevOps engineers to set up and manage their environment using familiar Amazon EKS tools.

This post demonstrated how to deploy Trident and configure a Trident Mirror Relationship between two Amazon EKS clusters using the AWS CLI, eksctl, and kubectl. To get started, visit NetApp Trident in AWS Marketplace. For Trident documentation, visit the Trident documentation repository.

About Authors

Michael Shaul

Michael Shaul is a Principal Architect at NetApp’s office of the CTO. He has over 20 years of experience building data management systems, applications, and infrastructure solutions. He has a unique in-depth perspective on cloud technologies, builder, and AI solutions.

Bhavin Shah

Bhavin Shah is a Principal Product Manager with NetApp. He has over a decade of experience in data management and has spent more than five years working with Kubernetes and helping organizations build data protection and disaster recovery solutions for containers running in production.

Eric Yuen

Eric Yuen is a Senior Partner Solutions Architect with AWS. He works closely with AWS Storage Partners building solutions and helps customers design storage environments on AWS. Eric brings 20 years of industry experience working with different storage and data protection technologies.