= k8s MariaDb Galera Cluster =
 * Links
   * [[k8s/MariaDbGaleraInitDb]]
   * [[https://severalnines.com/blog/galera-cluster-recovery-101-deep-dive-network-partitioning/]]
   * [[https://bobcares.com/blog/mysql-cluster-vs-galera/|Mysql-vs-Galera]]
   * [[https://proxysql.com/services/support/|Commercial Support]]
   * [[https://releem.com|Releem SaaS mysql tuning with agent]]
   * [[https://medium.com/dba-jungle/make-mariadb-galera-cluster-auto-recovery-fb1ce1d89f09]]
 * Safe to bootstrap
   * https://galeracluster.com/2016/11/introducing-the-safe-to-bootstrap-feature-in-galera-cluster/
   * After a sudden crash of the entire cluster, all nodes are considered unsafe to bootstrap from, so operator action is always required to force the use of a particular node as the bootstrap node.

== Restore huge db to Galera/Mariadb - using single node ==
 * https://severalnines.com/blog/guide-mysql-galera-cluster-restoration-using-mysqldump/
 * https://galeracluster.com/library/training/tutorials/galera-backup.html
 * https://github.com/mydumper/mydumper - multi-threaded db dump

== Restart - after orderly shutdown ==
 * Check for "safe_to_bootstrap: 1" in grastate.dat
 * See https://github.com/bitnami/charts/tree/main/bitnami/mariadb-galera#user-content-bootstraping-a-node-other-than-0

== Restart - after hard crash of all nodes ==
 * All grastate.dat files will now have "safe_to_bootstrap: 0" :(
 * Find the node with the last committed transaction:
{{{
mysqld --wsrep-recover
# Look in the logs for the highest "WSREP: Recovered position: 37bb-addd-xxx"
# Pick the node with the highest seqno and change its grastate.dat
# from "safe_to_bootstrap: 0" to "safe_to_bootstrap: 1"
}}}
 * k8s: recover from a hard restart by mounting the PVC volume into a temp container, then manually editing /mnt/data/grastate.dat
{{{
#!/usr/bin/env bash
export k8s_claimName=mariadb-galera-0
kubectl get pvc ${k8s_claimName} | grep "${k8s_claimName}\s\+Bound\s" \
  || echo "# Didn't find Bound pvc ${k8s_claimName} in namespace"
kubectl run -i --tty --rm volpodcontainer --overrides='
{
  "apiVersion": "v1",
  "kind": "Pod",
  "metadata": { "name": "volpod" },
  "spec": {
    "containers": [{
      "command": ["bash"],
      "image": "docker.io/diepes/debug:latest",
      "name": "volpod",
      "stdin": true,
      "tty": true,
      "volumeMounts": [{ "mountPath": "/mnt", "name": "galeradata" }]
    }],
    "restartPolicy": "Never",
    "volumes": [{
      "name": "galeradata",
      "persistentVolumeClaim": { "claimName": "'${k8s_claimName}'" }
    }],
    "tolerations": [{
      "effect": "NoSchedule",
      "key": "kubernetes.azure.com/scalesetpriority",
      "operator": "Equal",
      "value": "spot"
    }]
  }
}' --image="docker.io/diepes/debug:latest"
}}}

== HAPROXY liveness script for MariaDB Galera ==
 * https://github.com/olafz/percona-clustercheck

== MySQL (MariaDB) ram tuning ==
 * https://dev.mysql.com/doc/refman/8.0/en/innodb-buffer-pool-resize.html

== Error messages Mariadb/Galera ==
 1. "[Warning] WSREP: no nodes coming from prim view, prim not possible" or "[ERROR] WSREP: It may not be safe to bootstrap the cluster from this node. It was not the last one to leave the cluster ..."
   * This means no cluster primary node exists, and the node can't determine whether it should become primary.
   * Recovery:
     * We could try starting the DBs in parallel, or putting each to sleep and making the HealthCheck pass, while we manually follow the recovery steps.
 * '''Bootstrapping / recovery'''
 1. Delay restarts
   * Update the StatefulSet readinessProbe parameter initialDelaySeconds from the default 30 to 300 (5 minutes) to allow sufficient time to edit the impacted file.
 1. Find the latest db
{{{
mysqld --wsrep-recover
}}}
 1. Select the pod to boot first
   * Update grastate.dat:
{{{
cat /bitnami/mariadb/data/grastate.dat
# uuid: 2a651c5d-139e-11ee-8733-0eab9be77c14
# seqno: -1
# safe_to_bootstrap: 0
cd /bitnami/mariadb/data
sed -i "s/safe_to_bootstrap: 0/safe_to_bootstrap: 1/" grastate.dat
# Now delete / recreate the pod to bootstrap
}}}
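The "find the node with the last committed transaction" step above can be sketched as a small helper that compares the recovered seqno of each node and names the one to bootstrap from. The node names and log lines below are illustrative sample data, not real output; in practice you would collect the "WSREP: Recovered position" line from each pod's logs after running mysqld --wsrep-recover.

{{{
#!/usr/bin/env bash
# Sketch: pick the Galera node with the highest recovered seqno.
# Assumes one "<node> WSREP: Recovered position: <uuid>:<seqno>" line per node.
set -euo pipefail

# Sample input (hypothetical nodes/seqnos); replace with real log lines.
recovered_positions="mariadb-galera-0 WSREP: Recovered position: 37bb-addd:1052
mariadb-galera-1 WSREP: Recovered position: 37bb-addd:1060
mariadb-galera-2 WSREP: Recovered position: 37bb-addd:1057"

# Split on spaces/colons, sort numerically by the trailing seqno,
# and keep the node name from the highest line.
best_node=$(printf '%s\n' "$recovered_positions" \
  | awk -F'[ :]' '{print $NF, $1}' \
  | sort -n | tail -1 | cut -d' ' -f2)

echo "Bootstrap from: ${best_node}"
# For the sample data this prints: Bootstrap from: mariadb-galera-1
# Next: set "safe_to_bootstrap: 1" in that node's grastate.dat and restart it.
}}}

Numeric sort also handles a seqno of -1 (an unclean shutdown) correctly, so such nodes naturally lose the comparison.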