= k8s MultiAzStorage Problem =
* As of 2023-09 there seems to be a fundamental problem with k8s and stateful workloads.

== Problem 1 - Pod and PV (storage) not in the same AZ ==
* For Deployments this can be avoided by only creating the PV from the PVC once the pod is scheduled, via the StorageClass setting below (see the StorageClass sketch after this section):
{{{
volumeBindingMode: WaitForFirstConsumer
}}}
* The storage then waits for the pod to be scheduled and gets created in the same AZ.
* The real issue is a node failure, or any other rescheduling of pods, after which they can land in a different AZ than their PV (PersistentVolume/disk).
* Example error in the k8s event log (the diagnostic sketch below shows how to confirm the AZ mismatch):
{{{
Warning FailedScheduling pod/infra-elasticsearch-master-0 0/4 nodes are available: 1 Insufficient cpu, 3 node(s) had volume node affinity conflict. preemption: 0/4 nodes are available: 1 No preemption victims found for incoming pod, 3 Preemption is not helpful for scheduling..
}}}
* If you can lose the PVC data, just scale to 0, delete the PVC, and scale back up to get a new volume in the correct AZ (see the command sketch below).
* If you can't lose the data, you have to move the pod to the correct AZ (impossible for a StatefulSet) or move the data: back it up, create a new PV in the right AZ, and restore.
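
* Example StorageClass with delayed binding - a minimal sketch, assuming the AWS EBS CSI driver (the provisioner, parameters and the name gp3-wait are placeholders, adjust for your cluster); the important line is volumeBindingMode:
{{{
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-wait                  # placeholder name
provisioner: ebs.csi.aws.com      # assumed AWS EBS CSI driver, swap for your cloud
parameters:
  type: gp3
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer   # PV is only provisioned once the pod is scheduled
}}}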
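
* To confirm the mismatch behind a "volume node affinity conflict", compare the zone the PV is pinned to with the zones of the nodes - a minimal diagnostic sketch (<pv-name> is a placeholder):
{{{
# Show which zone the PV is pinned to (look at spec.nodeAffinity):
kubectl get pv <pv-name> -o yaml | grep -A 6 nodeAffinity

# Show which zone each node is in:
kubectl get nodes -L topology.kubernetes.io/zone
}}}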
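
* Command sketch for the "data can be lost" workaround, based on the example event above; the StatefulSet and PVC names are assumptions, adjust to your resources:
{{{
# Scale the workload down so the pod releases the volume
kubectl scale statefulset infra-elasticsearch-master --replicas=0

# StatefulSet PVCs are not deleted automatically, remove the stuck one by hand
kubectl delete pvc data-infra-elasticsearch-master-0

# Scaling back up re-creates the PVC; with WaitForFirstConsumer the new PV
# is provisioned in the AZ where the pod actually lands
kubectl scale statefulset infra-elasticsearch-master --replicas=1
}}}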