Embedded Cluster shutdown and startup procedures
This guide provides the correct procedures for safely shutting down and starting up Embedded Clusters. These procedures apply to both single-node and multi-node deployments, including High Availability (HA) configurations.
Following these procedures ensures system stability, maintains data integrity, and minimizes unplanned downtime during maintenance operations. The embedded cluster uses systemd units for service management, which handle automatic shutdown and startup processes during normal operations.
General guidelines
Automatic service management
The embedded cluster requires minimal manual intervention. The cluster is managed through systemd units that automatically handle proper service shutdown and restart during node reboots. In most scenarios, manual stop or start operations are unnecessary.
Benefits of automatic management:
- Ensures all services start in the correct order.
- Maintains service dependencies and health checks.
- Reduces risk of human error during maintenance.
Clean reboot procedures
Always perform clean system reboots when maintenance requires a restart:
sudo systemctl reboot
Warning
Avoid hard reboots (power button, forced shutdowns). They can cause data corruption, service instability, extended recovery times, or an inconsistent cluster state.
Manual service management
Manual stopping or starting of cluster services is possible but generally discouraged. Use this approach only when absolutely necessary, as incomplete or incorrect service management can leave the cluster in an unhealthy state.
Service stop commands
sudo systemctl stop k0sworker
sudo systemctl stop k0scontroller
sudo systemctl stop local-artifact-mirror
Service start commands
sudo systemctl start k0sworker
sudo systemctl start k0scontroller
sudo systemctl start local-artifact-mirror
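After a manual stop or start, it can help to confirm what state each unit is actually in. The following is a minimal sketch, not part of the product tooling: the function name report_cluster_services is hypothetical, and it assumes the unit names listed above. Depending on the node's role, some of these units may not exist on a given node, in which case systemctl reports them as inactive.

```shell
#!/bin/sh
# Hypothetical helper: print the systemd state of each unit named in this guide.
# Assumption: unit names match the stop/start commands above; a unit that is not
# installed on this node is reported as "inactive" by systemctl.
report_cluster_services() {
    for unit in k0sworker k0scontroller local-artifact-mirror; do
        # `systemctl is-active` prints the state (active, inactive, failed, ...)
        # and exits non-zero when the unit is not active, hence `|| true`.
        state=$(systemctl is-active "$unit" 2>/dev/null || true)
        printf '%s: %s\n' "$unit" "${state:-unknown}"
    done
}
```

Calling report_cluster_services prints one line per unit, which makes it easy to see whether a manual stop or start left any service in an unexpected state.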
Single-node clusters
No additional procedures are required for single-node clusters beyond performing clean reboots with systemctl reboot.
Because all services, worker processes, and applications run on the same node, shutting down this node makes the entire cluster unavailable until the restart completes and all services are restored.
Multi-node and High Availability clusters
Multi-node deployments, particularly in High Availability (HA) configurations, require coordinated maintenance procedures to preserve cluster availability and maintain service redundancy.
Worker nodes
- Worker nodes operate independently and can be rebooted individually without significant cluster impact.
- No coordination between worker nodes is necessary, though clean reboots using systemctl reboot remain the recommended procedure for proper service management.
Control-plane nodes
Control-plane nodes provide essential cluster management services including API coordination, scheduling decisions, and maintaining distributed cluster state.
- In HA deployments, the first three nodes typically function as control-plane nodes.
- Critical maintenance requirements for preserving cluster availability:
  - One-at-a-time maintenance: reboot control-plane nodes individually to maintain cluster management capabilities.
  - Maintain majority consensus: ensure at least two control-plane nodes remain active and joined to the cluster during maintenance.
  - Avoid concurrent downtime: simultaneous control-plane node outages can trigger cluster-wide service interruption and require complex recovery procedures.
Proper sequencing prevents quorum loss, which would otherwise disable cluster management functions and potentially impact all running workloads.
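The "at least two out of three" rule follows from majority arithmetic. As a rough illustration, assuming the control plane uses a simple majority quorum (more than half of the control-plane nodes must be up), the sketch below computes the quorum size and how many nodes can be offline at once. The function names quorum_size and tolerated_outages are hypothetical.

```shell
#!/bin/sh
# Assumption: simple majority quorum, i.e. floor(n / 2) + 1 of n control-plane nodes.

# Smallest number of control-plane nodes that still forms a majority.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of control-plane nodes that can be offline at once without losing quorum.
tolerated_outages() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

nodes=3
echo "control-plane nodes: $nodes"
echo "quorum size:         $(quorum_size "$nodes")"        # 2
echo "tolerated outages:   $(tolerated_outages "$nodes")"  # 1
```

With three control-plane nodes only one can be down at a time, which is why the reboots must be done one node at a time.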
Best practices
- Use systemctl reboot for all planned restarts.
- Avoid hard shutdowns to prevent data corruption.
- Do not manually stop or start services unless absolutely necessary.
- Schedule maintenance during appropriate windows.
- Document maintenance activities for operational records.
Multi-node specific practices
- Reboot control-plane nodes sequentially.
- Maintain minimum active control-plane nodes (at least 2 out of 3).
- Verify cluster health after each node maintenance.
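One way to check cluster health after a node comes back is to confirm that every node reports Ready before moving on to the next one. This guide does not prescribe a specific check, so the following is only a sketch: the function name all_nodes_ready is hypothetical, and it assumes kubectl access to the cluster is already configured on the node where it runs.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin and
# succeed only if at least one node is listed and every node's STATUS column
# (second field) is exactly "Ready". Cordoned nodes such as
# "Ready,SchedulingDisabled" are treated as not ready.
all_nodes_ready() {
    awk '
        { total++ }
        $2 != "Ready" { print "not ready: " $1 " (" $2 ")"; bad++ }
        END { if (total == 0 || bad > 0) exit 1 }
    '
}

# Usage, assuming kubectl is configured for this cluster:
#   kubectl get nodes --no-headers | all_nodes_ready && echo "all nodes Ready"
```

Running this check after each control-plane reboot, and waiting until it succeeds before rebooting the next node, keeps the sequential procedure from overlapping two outages.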
Proper embedded cluster shutdown and startup procedures are essential for maintaining system reliability and data integrity. The automated service management through systemd units provides efficient operation with minimal administrative overhead.