How to Check Status of MariaDB Galera Cluster?
From the MariaDB prompt, you can check the status of write-set replication throughout the cluster. Status variables that relate to write-set replication have the prefix wsrep_, meaning that you can display them all using the following query:
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_%';
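When the checks are scripted rather than typed at the prompt, it helps to turn the query output into a lookup table first. The following is a minimal sketch, assuming the output was captured as tab-separated name/value lines (for example from the client's batch mode with column names suppressed). The function name parse_status_output and the sample text are hypothetical and only for illustration.

```python
from typing import Dict


def parse_status_output(text: str) -> Dict[str, str]:
    """Parse tab-separated 'name<TAB>value' lines into a dict."""
    status: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first tab only, so values may contain spaces.
        name, _, value = line.partition("\t")
        status[name] = value
    return status


# Hypothetical sample output, not taken from a real cluster.
sample = "wsrep_cluster_size\t3\nwsrep_cluster_status\tPrimary\nwsrep_ready\tON\n"
status = parse_status_output(sample)
print(status["wsrep_cluster_size"])
```

Values are kept as strings, exactly as the server reports them, so numeric variables need an explicit conversion before comparison.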
Checking Cluster Integrity
The cluster has integrity when all nodes in it receive and replicate write-sets from all other nodes. You can check cluster integrity using the following status variables:
How to Check cluster state UUID?
wsrep_cluster_state_uuid shows the cluster state UUID, which you can use to determine whether the node is part of the cluster. You can run this command on all the nodes of your cluster. Each node in the cluster should provide the same value. When a node carries a different value, this indicates that it is no longer connected to the rest of the cluster. Once the node reestablishes connectivity, it realigns itself with the other nodes.
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_state_uuid';
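Comparing the value by eye gets tedious with more than a few nodes. As a rough sketch, assuming you have already collected the UUID reported by each node into a dictionary, the comparison could look like this. The function name, node names and UUID strings are hypothetical placeholders.

```python
from collections import Counter
from typing import Dict, List


def nodes_with_divergent_uuid(uuid_by_node: Dict[str, str]) -> List[str]:
    """Return nodes whose wsrep_cluster_state_uuid differs from the most common value."""
    if not uuid_by_node:
        return []
    # The most frequently reported UUID is treated as the cluster's value.
    # In a tie, Counter keeps the value that was seen first.
    majority_uuid, _ = Counter(uuid_by_node.values()).most_common(1)[0]
    return sorted(node for node, uuid in uuid_by_node.items() if uuid != majority_uuid)


# Hypothetical values, for illustration only.
uuids = {
    "node1": "aaaaaaaa-0000-0000-0000-000000000001",
    "node2": "aaaaaaaa-0000-0000-0000-000000000001",
    "node3": "bbbbbbbb-0000-0000-0000-000000000002",
}
print(nodes_with_divergent_uuid(uuids))
```

An empty list means every node reported the same cluster state UUID.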
How to Check Total Number of Cluster Changes That Have Happened?
wsrep_cluster_conf_id shows the total number of cluster changes that have happened, which you can use to determine whether or not the node is a part of the Primary Component. Each node in the cluster should provide the same value. When a node carries a different value, this indicates that the cluster is partitioned. Once the node reestablishes network connectivity, the value aligns itself with the others.
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_conf_id';
How to Check Cluster Size?
wsrep_cluster_size shows the number of nodes in the cluster, which you can use to determine if any are missing. You can run this check on any node. When the check returns a value lower than the number of nodes in your cluster, it means that some nodes have lost network connectivity or they have failed.
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';
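The comparison against the expected node count is simple arithmetic, but note that the server returns the value as text. A small sketch, assuming you know how many nodes the cluster is supposed to have (the function name missing_node_count is hypothetical):

```python
def missing_node_count(reported_size: str, expected_size: int) -> int:
    """Return how many nodes are missing; 0 when all expected nodes are present."""
    # SHOW STATUS returns values as strings, so convert before comparing.
    return max(expected_size - int(reported_size), 0)


# Hypothetical example: a three-node cluster where one node has dropped out.
print(missing_node_count("2", 3))
```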
How to Check Galera Cluster Status?
wsrep_cluster_status shows the primary status of the cluster component that the node is in, which you can use in determining whether your cluster is experiencing a partition. The node should only return a value of Primary. Any other value indicates that the node is part of a nonoperational component. This occurs in cases of multiple membership changes that result in a loss of quorum or in cases of split-brain situations.
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';
When these status variables check out and return the desired results on each node, the cluster is up and has integrity. What this means is that replication is able to occur normally on every node. The next step then is checking node status to ensure that they are all in working order and able to receive write-sets.
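The four integrity checks can be combined into a single pass over all nodes. The sketch below assumes the status variables of every node have already been collected into one dictionary per node. The function name check_cluster_integrity, the node names and the message wording are hypothetical.

```python
from typing import Dict, List


def check_cluster_integrity(
    status_by_node: Dict[str, Dict[str, str]], expected_size: int
) -> List[str]:
    """Return a list of human-readable problems; an empty list means integrity holds."""
    problems: List[str] = []

    # These two variables should carry the same value on every node.
    for name in ("wsrep_cluster_state_uuid", "wsrep_cluster_conf_id"):
        values = {status.get(name) for status in status_by_node.values()}
        if len(values) > 1:
            problems.append(f"{name} differs between nodes")

    for node, status in sorted(status_by_node.items()):
        size = status.get("wsrep_cluster_size")
        if size != str(expected_size):
            problems.append(f"{node}: wsrep_cluster_size is {size}, expected {expected_size}")
        cluster_status = status.get("wsrep_cluster_status")
        if cluster_status != "Primary":
            problems.append(f"{node}: wsrep_cluster_status is {cluster_status}, expected Primary")

    return problems
```

A node that did not report a variable at all shows up as None in the messages, which is usually a sign that the query against that node failed.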
Checking the Node Status
In addition to checking cluster integrity, you can also monitor the status of individual nodes. This shows whether nodes receive and process updates from the cluster write-sets and can indicate problems that may prevent replication.
wsrep_ready shows whether the node can accept queries. When the node returns a value of ON, it can accept write-sets from the cluster. When it returns the value OFF, almost all queries fail with the error:
ERROR 1047 (08S01) Unknown Command
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_ready';
wsrep_connected shows whether the node has network connectivity with any other nodes. When the value is ON, the node has a network connection to one or more other nodes forming a cluster component. When the value is OFF, the node does not have a connection to any cluster components.
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_connected';
wsrep_local_state_comment shows the node state in a human-readable format. When the node is part of the Primary Component, the typical return values are Joining, Waiting on SST, Joined, Synced or Donor. If the node is part of a nonoperational component, the return value is Initialized.
MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';
When each of these status variables returns the desired value, the node is in working order. This means that it is receiving write-sets from the cluster and replicating them to tables in the local database.
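The three node-level checks can be evaluated together for a single node. This is only a sketch under the rules described above: wsrep_ready and wsrep_connected should be ON, and a state of Initialized points to a nonoperational component. The function name check_node_status and the message wording are hypothetical.

```python
from typing import Dict, List


def check_node_status(status: Dict[str, str]) -> List[str]:
    """Return problems found in one node's status; an empty list means it looks healthy."""
    problems: List[str] = []

    if status.get("wsrep_ready") != "ON":
        problems.append("wsrep_ready is not ON: the node cannot accept queries")
    if status.get("wsrep_connected") != "ON":
        problems.append("wsrep_connected is not ON: no connection to any cluster component")
    if status.get("wsrep_local_state_comment") == "Initialized":
        problems.append("node state is Initialized: node is in a nonoperational component")

    return problems


# Hypothetical status of a healthy node.
print(check_node_status({
    "wsrep_ready": "ON",
    "wsrep_connected": "ON",
    "wsrep_local_state_comment": "Synced",
}))
```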
Checking the Replication Health
The next step is to identify performance issues and problem areas so that you can get the most from your cluster. You can monitor the local received queue and Flow Control using the following status variables:
wsrep_local_recv_queue_avg shows the average size of the local received queue since the last status query. When the node returns a value higher than 0.0 it means that the node cannot apply write-sets as fast as it receives them, which can lead to replication throttling.
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_local_recv_queue_avg';
wsrep_local_send_queue_avg shows the average send queue length since the last FLUSH STATUS query. Values much greater than 0.0 indicate replication throttling or network throughput issues, such as a bottleneck on the network link.
MariaDB [(none)]> SHOW STATUS LIKE 'wsrep_local_send_queue_avg';
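Both queue averages are reported as decimal strings, so a scripted check has to convert them before comparing. The sketch below flags any received-queue average above 0.0, as described above. Since "much greater than 0.0" is not a precise limit, the send-queue threshold of 0.5 is purely an assumption chosen for illustration and should be tuned for your workload. The function name check_replication_health is hypothetical.

```python
from typing import Dict, List


def check_replication_health(
    status: Dict[str, str], send_queue_threshold: float = 0.5
) -> List[str]:
    """Return warnings based on the local receive and send queue averages."""
    warnings: List[str] = []

    recv_avg = float(status.get("wsrep_local_recv_queue_avg", "0"))
    send_avg = float(status.get("wsrep_local_send_queue_avg", "0"))

    # Any received-queue backlog means write-sets arrive faster than they are applied.
    if recv_avg > 0.0:
        warnings.append(f"receive queue average is {recv_avg}: node applies write-sets too slowly")
    # The threshold is an assumed value, not a documented limit.
    if send_avg > send_queue_threshold:
        warnings.append(f"send queue average is {send_avg}: possible throttling or network bottleneck")

    return warnings
```

Because the averages accumulate over time, a single reading is less informative than the trend across several readings.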