Backup and restore
This document outlines the process for backing up and restoring an ITRS Analytics instance, focusing on the use of Velero for disaster recovery.
Overview of backup and restore
To enable backup and restore for an ITRS Analytics instance, ensure that a remote storage repository is available and that the Kubernetes platform supports CSI volume snapshots.
Velero is the primary tool used to perform both scheduled and manual backups, as well as manual restores. It backs up the following critical Kubernetes resource types:
- PersistentVolumeClaim (PVC): pointers to your stored data.
- PersistentVolume (PV): the actual storage volumes themselves.
- Secret: sensitive data such as credentials and keys.
KOTS-based deployment
Beyond the resource types listed above, the backup also includes the resources required by KOTS. The KOTS Admin Console provides a user-friendly interface for running manual backups and configuring backup schedules.
Note
VolumeSnapshots provide point-in-time backups per volume but do not ensure consistency across multiple volumes. As a result, in a disaster recovery scenario, some data may be duplicated or out of sync after a restore. This limitation will persist until Velero supports VolumeGroupSnapshots and your specific storage provider implements CSI VolumeGroupSnapshot support.
Prerequisites for backup and restore
Before you can implement ITRS Analytics backup and restore, ensure the following prerequisites are met:
- A remote object storage repository (for example, AWS S3 or Google Cloud Storage) must be created and fully accessible to your Kubernetes cluster.
- The storage classes used for the Persistent Volume Claims associated with ITRS Analytics workloads must support CSI volume snapshots.
- Velero 1.16.1 or later must be installed in your Kubernetes cluster with the CSI plugin enabled. Always use the latest Velero agent for access to new features and fixes related to CSI snapshot handling.
- The Enable IAX backup and restore option must be selected. It is enabled by default. You can modify this setting under Advanced Settings > Backup and Restore in the KOTS Admin Console. For more information, see Preflight checks for CSI VolumeSnapshot support in StorageClass.
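The storage class prerequisite above can be spot-checked from the command line. The following is a minimal sketch, not an official ITRS or Velero check: it assumes that a StorageClass is a snapshot candidate when some VolumeSnapshotClass names the same CSI driver as the StorageClass provisioner. The function name and the /tmp file paths are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: list StorageClasses whose provisioner has no VolumeSnapshotClass
# with a matching driver. This is a heuristic, not a guarantee of support.

# storage_classes_without_snapshots SC_FILE VSC_FILE
#   SC_FILE:  lines of "<storageclass-name> <provisioner>"
#   VSC_FILE: lines of "<volumesnapshotclass-name> <driver>"
storage_classes_without_snapshots() {
  # The first awk input is the VolumeSnapshotClass listing: remember each driver.
  # The second is the StorageClass listing: print names with an unknown provisioner.
  awk 'FILENAME == ARGV[1] { drivers[$2] = 1; next }
       !($2 in drivers)    { print $1 }' "$2" "$1"
}

# Example wiring against a live cluster (requires kubectl and the snapshot CRDs):
#   kubectl get storageclass --no-headers \
#     -o custom-columns=NAME:.metadata.name,PROVISIONER:.provisioner > /tmp/sc.txt
#   kubectl get volumesnapshotclass --no-headers \
#     -o custom-columns=NAME:.metadata.name,DRIVER:.driver > /tmp/vsc.txt
#   storage_classes_without_snapshots /tmp/sc.txt /tmp/vsc.txt
```

Any storage class printed by this function and used by ITRS Analytics workloads deserves a closer look before you rely on backups.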
Checking Velero version
Run the following command to verify the Velero version installed in your cluster.
velero version
You should see output similar to this:
Client:
Version: v1.16.0
Git commit: -
Server:
Version: v1.16.1
The Server version indicates the Velero version running in your cluster, while the Client version refers to the Velero CLI tool installed locally on your machine.
Warning
If you have only installed the Velero CLI locally and not yet deployed the Velero server in the cluster, you may receive an error like:
<error getting server version: no matches for kind "ServerStatusRequest" in version "velero.io/v1">
This is expected. You can verify the CLI installation with velero version, but the full output (including the server version) appears only after Velero is deployed in the cluster.
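The version check above can also be scripted. The sketch below parses output in the format shown earlier and compares the server version against the 1.16.1 minimum from the prerequisites. The function names are illustrative, and the comparison relies on GNU sort -V, which is an assumption about your environment.

```shell
#!/usr/bin/env bash
# Sketch: extract the server version from `velero version` output and compare
# it against a minimum version.

# Print the server version (without the leading "v") from output on stdin.
server_version() {
  awk '/^Server:/ { in_server = 1; next }
       in_server && $1 == "Version:" { sub(/^v/, "", $2); print $2; exit }'
}

# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example wiring (requires the velero CLI and a deployed server):
#   v="$(velero version | server_version)"
#   if [ -n "$v" ] && version_ge "$v" "1.16.1"; then
#     echo "Velero server $v meets the minimum"
#   else
#     echo "Velero server is missing or older than 1.16.1"
#   fi
```

An empty result from server_version corresponds to the case in the warning above, where only the CLI is installed.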
Example installation of Velero with AWS
This section provides a step-by-step guide for setting up Velero with an AWS S3 bucket for storage.
Initial setup on AWS
Ensure the AWS user executing these commands has the necessary credentials for managing IAM (Identity and Access Management) and S3 resources. All subsequent commands make use of environment variables, so run them within the same console session.
- Create an S3 bucket in your desired AWS region. For best performance, use the same region as the Kubernetes cluster where ITRS Analytics is installed. If the region is us-east-1, omit the --create-bucket-configuration option, because that region does not accept a LocationConstraint.
BUCKET=iax-backups
REGION=us-east-2
aws s3api create-bucket \
  --bucket $BUCKET \
  --region $REGION \
  --create-bucket-configuration LocationConstraint=$REGION
- Create an IAM user with read and write access to the S3 bucket.
  - Create the user.
USER=iax-backups
aws iam create-user --user-name $USER
  - Create the user policy. This policy grants permissions for EC2 (for volume snapshots) and S3 (for backup storage).
cat > /tmp/policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeVolumes",
        "ec2:DescribeSnapshots",
        "ec2:CreateTags",
        "ec2:CreateVolume",
        "ec2:CreateSnapshot",
        "ec2:DeleteSnapshot"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET}/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::${BUCKET}"
      ]
    }
  ]
}
EOF
aws iam put-user-policy --user-name $USER --policy-name $USER --policy-document file:///tmp/policy.json
  - Create an API access key.
aws iam create-access-key --user-name $USER
    This command generates the AccessKeyId and SecretAccessKey for the new API credentials. Copy these values and use them to replace the placeholders in the command below.
cat > /tmp/creds.txt <<EOF
[default]
aws_access_key_id=ACCESS_KEY_ID
aws_secret_access_key=SECRET_ACCESS_KEY
EOF
- Install Velero CLI.
  The Velero Command Line Interface (CLI) is installed locally, not on the Kubernetes server. Refer to the official Velero documentation for installation instructions for your operating system.
- Run velero version again to confirm the client version. The server version is reported only after the server components are installed in the next step. See Checking Velero version.
- Install Velero server components.
  You can install the Velero server through the CLI or Helm. Helm allows for greater configuration flexibility, but for simplicity, the CLI method is demonstrated here.
  Note
  Velero should not be installed in the same Kubernetes namespace as your ITRS Analytics instance. It is highly recommended to use a dedicated namespace, such as velero (which is the default).
velero install \
  --features=EnableCSI \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.11.1 \
  --bucket $BUCKET \
  --prefix iax-backups/velero \
  --backup-location-config region=$REGION \
  --snapshot-location-config region=$REGION \
  --secret-file /tmp/creds.txt \
  --image velero/velero:v1.16.1 \
  --use-node-agent
  Explanation of flags:
  - --use-node-agent: deploys the Velero node agent, which is used for KOTS-based deployment backup and restore.
  - --features=EnableCSI: enables support for CSI volume snapshots, a critical component for ITRS Analytics backups.
  - --image: specifies the exact Velero version to be installed.
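After the installation steps above, it can help to confirm that the Velero server is running and can reach the bucket. The following is a minimal sketch, assuming Velero was installed in the default velero namespace. The function name is illustrative and is not part of Velero.

```shell
#!/usr/bin/env bash
# Sketch: report backup storage locations that are not in phase "Available".

# Reads lines of "<location-name> <phase>" on stdin and prints the names of
# locations whose phase is anything other than Available.
unavailable_locations() {
  awk '$2 != "Available" { print $1 }'
}

# Example wiring (requires kubectl; assumes the "velero" namespace):
#   kubectl -n velero rollout status deployment/velero --timeout=120s
#   kubectl -n velero get backupstoragelocations.velero.io --no-headers \
#     -o custom-columns=NAME:.metadata.name,PHASE:.status.phase \
#     | unavailable_locations
```

If a location is printed, the bucket name, region, and credentials file passed to velero install are likely places to look first.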
Backup process
A KOTS-based deployment provides two methods for performing backups:
- KOTS Admin Console – allows for both manual backups and the scheduling of automated backups.
- KOTS CLI Tool – supports manual backups via the command line.
For detailed instructions on using either method, refer to the Create and Schedule Backups documentation.
Note
Only Full Snapshots (Instance) are supported.
Verifying backup success
After a backup is triggered, either manually or by schedule, confirm that it completed successfully using the following methods.
Note
It’s recommended to perform these checks periodically to ensure that backups are running reliably.
- Check the Snapshot tab in the KOTS Admin Console to verify the backup status.
- Run the following command in the KOTS CLI:
kubectl kots get backups
- Confirm that the corresponding volume snapshots exist in your cloud provider’s storage console.
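For periodic checks, the same information can be read from the Velero Backup resources directly. The sketch below is one possible approach, assuming Velero runs in the velero namespace. The function name is illustrative, and a backup that is still in progress is also reported, so run it after backups are expected to have finished.

```shell
#!/usr/bin/env bash
# Sketch: flag Velero backups whose phase is not "Completed".

# Reads lines of "<backup-name> <phase>" on stdin, prints "<name>: <phase>"
# for each backup that is not Completed, and returns non-zero if any is found.
incomplete_backups() {
  awk '$2 != "Completed" { print $1 ": " $2; bad = 1 }
       END { exit bad }'
}

# Example wiring (requires kubectl; assumes the "velero" namespace):
#   kubectl -n velero get backups.velero.io --no-headers \
#     -o custom-columns=NAME:.metadata.name,PHASE:.status.phase \
#     | incomplete_backups || echo "Some backups need attention"
```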
Restore process
In a recovery scenario, follow these steps to restore an ITRS Analytics instance to its last backed-up state.
- Ensure Velero is operational.
  - If Velero was compromised as part of the disaster, reinstall it and point it to the remote repository containing your backup history. Follow the initial installation steps provided, ensuring that the configuration matches the original setup exactly.
  - If Velero is still operational, you may skip this step.
- Use the KOTS CLI tool to list available backups.
kubectl kots get backups
  - Identify the appropriate backup to restore, based on the timestamp and successful completion status.
  - In your cloud provider’s console, confirm that the corresponding volume snapshots still exist and match the selected backup (check timestamps and tags).
- Clean up the namespace.
  Depending on the current state of your ITRS Analytics installation, it may be necessary to clean up the existing namespace to fully remove the previous instance.
  - If the ITRS Analytics namespace (for example, itrs) no longer exists, re-create it and proceed to the next step.
  - Otherwise, uninstall the ITRS Analytics components:
helm uninstall iax -n itrs
helm uninstall iax-operator -n itrs
  - If needed, manually delete the Obcerv resource. If deletion hangs, you may need to remove the resource’s finalizer manually.
  - Take note of any manually resized PersistentVolumeClaims (PVCs). If their sizes differ from those defined in your ITRS Analytics configuration, you will need to adjust the configuration accordingly in a later step.
  - Delete any remaining PVCs in the namespace.
- Restore the necessary Kubernetes components such as secrets, PVCs, and PVs.
  - Initiate the restore using the KOTS CLI. Replace BACKUP_NAME with the name of the backup you identified when listing the available backups.
kubectl kots restore --from-backup BACKUP_NAME
  - Verify that the restore completed successfully. The output should look similar to:
• Deleting Admin Console ✓
• Restoring Admin Console ✓
• Restoring Applications ✓
• Restore completed successfully.
  - Also confirm that secrets and PVCs have been properly restored.
- Reinstall your ITRS Analytics instance.
  - Start the KOTS Admin Console.
kubectl kots admin-console
  - Then open the admin console in your browser. Navigate to the Applications tab, and redeploy the ITRS Analytics instance. This ensures the application is reinstalled with the correct configuration.
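The clean-up step above asks you to note any manually resized PVCs before deleting them. One way to do that is to record the requested sizes and compare them with the sizes in your configuration. This is a minimal sketch: the function name, the file paths, and the PVC names in the comments are hypothetical, and the expected-sizes file is something you would write by hand from your own ITRS Analytics configuration.

```shell
#!/usr/bin/env bash
# Sketch: compare recorded PVC sizes against the sizes expected from the
# configuration and print the PVCs that differ.

# pvc_size_diff RECORDED_FILE EXPECTED_FILE
#   Both files contain lines of "<pvc-name> <size>", for example "data-0 200Gi".
#   Prints "<pvc-name> <recorded-size> <expected-size>" for each mismatch.
#   Sizes are compared as strings, so 1Ti and 1024Gi count as different.
pvc_size_diff() {
  awk 'FILENAME == ARGV[1] { recorded[$1] = $2; next }
       ($1 in recorded) && recorded[$1] != $2 { print $1, recorded[$1], $2 }' \
    "$1" "$2"
}

# Example wiring, run before deleting the PVCs (requires kubectl; assumes the
# "itrs" namespace used in the examples above):
#   kubectl -n itrs get pvc --no-headers \
#     -o custom-columns=NAME:.metadata.name,SIZE:.spec.resources.requests.storage \
#     > /tmp/pvc-sizes.txt
#   pvc_size_diff /tmp/pvc-sizes.txt /tmp/expected-sizes.txt
```

Each line of output identifies a PVC whose size would need to be reflected in the configuration before you redeploy.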