Embedded Cluster shutdown and startup procedures

This guide provides the correct procedures for safely shutting down and starting up Embedded Clusters. These procedures apply to both single-node and multi-node deployments, including High Availability (HA) configurations.

Following these procedures ensures system stability, maintains data integrity, and minimizes unplanned downtime during maintenance operations. The Embedded Cluster uses systemd units for service management, which handle automatic shutdown and startup processes during normal operations.

General guidelines

Pre-maintenance support bundle generation

Note

Always generate and save a support bundle before performing any maintenance operations. This is a critical step that provides essential diagnostic information for troubleshooting potential issues that may arise during or after maintenance.

Before proceeding with any shutdown or maintenance procedure:

  1. Generate a support bundle from the KOTS Admin Console.
  2. Upload the support bundle to a secure location such as:
    • A dedicated server accessible to your team
    • An administrator’s PC or laptop
    • A network-accessible storage location

Never skip this step: the support bundle captures the last known good state of your system before maintenance begins.
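As an illustration of the storage step, the following is a minimal sketch of a helper that copies a downloaded support bundle into an archive directory and records a SHA-256 checksum beside it. The function name archive_bundle and the paths in the usage note are hypothetical and are not part of the product; the sketch assumes the bundle has already been downloaded from the KOTS Admin Console as a regular file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the product): copy a support bundle to an
# archive directory under a timestamped name and write its SHA-256 checksum.
archive_bundle() {
  local bundle="$1" dest_dir="$2"
  local stamp name

  # Refuse to continue if the bundle file does not exist.
  [ -f "$bundle" ] || { echo "bundle not found: $bundle" >&2; return 1; }
  mkdir -p "$dest_dir" || return 1

  stamp="$(date +%Y%m%d-%H%M%S)"
  name="${stamp}-$(basename "$bundle")"
  cp -- "$bundle" "$dest_dir/$name" || return 1

  # Write the checksum with a relative file name so it can be verified
  # later from inside the archive directory with: sha256sum -c <file>.sha256
  ( cd "$dest_dir" && sha256sum "$name" > "$name.sha256" ) || return 1

  # Print the archived path so callers can log it.
  echo "$dest_dir/$name"
}
```

A call might look like archive_bundle ~/Downloads/support-bundle.tar.gz /mnt/maintenance-archive, where both paths are placeholders. Keeping a checksum next to the bundle makes it possible to confirm later that the file sent to support is the one captured before maintenance.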

Automatic service management

The Embedded Cluster requires minimal manual intervention. The cluster is managed through systemd units that automatically handle proper service shutdown and restart during node reboots. In most scenarios, manual stop or start operations are unnecessary.

Benefits of automatic management:

Clean reboot procedures

Always perform clean system reboots when maintenance requires a restart:

sudo systemctl reboot

Warning

Avoid hard reboots (power button, forced shutdowns) as they can lead to data corruption, service instability, extended recovery times, or potential cluster state inconsistencies.

Single-node clusters

Single-node clusters require no additional procedures beyond a clean reboot with sudo systemctl reboot.

Because all services, worker processes, and applications run on the same node, shutting it down makes the entire cluster unavailable until the restart completes and all services are restored.

Multi-node and High Availability clusters

Multi-node deployments, particularly in High Availability (HA) configurations, require coordinated maintenance procedures to preserve cluster availability and maintain service redundancy.

Worker nodes

Control-plane nodes

Control-plane nodes provide essential cluster management services including API coordination, scheduling decisions, and maintaining distributed cluster state.

Proper sequencing prevents quorum loss, which would otherwise disable cluster management functions and potentially impact all running workloads.
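To make the quorum constraint concrete, here is a small sketch of the arithmetic. It assumes the usual majority rule of consensus-based stores such as etcd, where a cluster of n voting members needs floor(n/2) + 1 of them online; whether your deployment follows exactly this rule is an assumption to confirm for your environment. The function names are illustrative only.

```shell
#!/usr/bin/env bash
# Assumption: majority quorum, i.e. floor(n/2) + 1 voting members must be up.

# Print the number of control-plane nodes needed for quorum.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Print how many control-plane nodes can be offline at once
# without dropping below quorum.
max_offline() {
  echo $(( $1 - ($1 / 2 + 1) ))
}
```

Under this rule, max_offline 3 prints 1 and max_offline 5 prints 2, which is why control-plane maintenance is normally done one node at a time in a three-node cluster. Note that max_offline 2 prints 0: a two-node control plane cannot lose either member without losing quorum.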

Best practices

Multi-node specific practices

Proper Embedded Cluster shutdown and startup procedures are essential for maintaining system reliability and data integrity. The automated service management through systemd units provides efficient operation with minimal administrative overhead.

Manual service management

Manual stopping or starting of cluster services is possible but generally discouraged. Use this approach only when advised by ITRS Support, as incomplete or incorrect service management can leave the cluster in an unhealthy state.

Service stop commands


sudo systemctl stop k0sworker
sudo systemctl stop k0scontroller
sudo systemctl stop local-artifact-mirror

Service start commands

sudo systemctl start k0sworker
sudo systemctl start k0scontroller
sudo systemctl start local-artifact-mirror
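
After starting services manually, it can help to confirm what state each unit is in. The following is a minimal sketch that reports the state of the three units named above using systemctl is-active. The function name report_units and the SYSTEMCTL override variable are hypothetical conveniences for this example. It is also an assumption that, depending on the node's role, some of these units may not be present on a given node, in which case the reported state will not be active.

```shell
#!/usr/bin/env bash
# Unit names are taken from the stop/start commands in this guide.
UNITS="k0sworker k0scontroller local-artifact-mirror"

# Print "<unit>: <state>" for each unit.
# SYSTEMCTL is a hypothetical override for dry runs; it defaults to systemctl.
report_units() {
  local unit state
  for unit in $UNITS; do
    # is-active exits non-zero for units that are not active, so the
    # exit status is ignored here and only the printed state is kept.
    state="$("${SYSTEMCTL:-systemctl}" is-active "$unit" 2>/dev/null)" || true
    echo "$unit: ${state:-unknown}"
  done
}
```

Running report_units on a node prints one line per unit. This only reads unit state and does not stop or start anything, so it is safe to run before and after a maintenance window to compare results.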