Kubernetes K8s/Monitoring

Actual CPU usage (as measured, not the requested amount)

$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | jq '.items | .[] | select(.metadata.name|test("pod-a"))' | head -n 200
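The raw Metrics API reports CPU usage as quantity strings such as 156340000n (nanocores), 4836u (microcores) or 250m (millicores), which are awkward to compare by eye. A minimal sketch of a converter to whole millicores follows; cpu_to_millicores is a hypothetical helper name, not part of kubectl, and it assumes only the n, u and m suffixes or a plain core count.

```shell
# Hypothetical helper: convert a CPU quantity string to whole millicores.
# Assumes the suffix is n (nano), u (micro), m (milli) or absent (cores).
cpu_to_millicores() {
  awk -v q="$1" 'BEGIN {
    suffix = substr(q, length(q))          # last character of the quantity
    value  = substr(q, 1, length(q) - 1)   # everything before the suffix
    if      (suffix == "n") m = value / 1000000
    else if (suffix == "u") m = value / 1000
    else if (suffix == "m") m = value + 0
    else                    m = q * 1000   # no suffix: whole or fractional cores
    printf "%d\n", int(m)                  # fractions of a millicore are dropped
  }'
}

# Example:
#   cpu_to_millicores 156340000n
```

The input strings can be pulled from the raw output with jq -r '.items[].containers[].usage.cpu'.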

Verify the metrics server used in k8s

$ kubectl describe apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io

Events happening in the cluster

kubectl -n kube-system get events --sort-by='{.lastTimestamp}'

kubectl -n kube-system get events -w

kubectl get events --field-selector type=Warning -w &

kubectl get pods -A -w -o wide | grep "^\|Running\|Terminating" &
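The grep pattern above starts with ^, so it matches every line and only highlights Running and Terminating. To hide healthy pods instead, a small filter can drop the Running rows. This is a sketch that assumes the -A column layout (NAMESPACE NAME READY STATUS ...), so STATUS is field 4; pods_not_running is a hypothetical name.

```shell
# Hypothetical filter: keep the header line and every pod whose STATUS
# column (field 4 when -A / --all-namespaces is used) is not "Running".
pods_not_running() {
  awk 'NR == 1 || $4 != "Running"'
}

# Example:
#   kubectl get pods -A -o wide | pods_not_running
```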

Node cpu and memory utilization

# Get pod details (swap requests for limits, or cpu for memory)
kubectl get po --all-namespaces -o=jsonpath="{range .items[*]}{.metadata.namespace}:{.metadata.name}{'\n'}{range .spec.containers[*]}  {.name}:{.resources.requests.cpu}{'\n'}{end}{'\n'}{end}"
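To get one total instead of a per-container listing, the CPU request values can be summed. The sketch below assumes one quantity per line on stdin, written either as millicores (100m) or as cores (1, 0.5); blank lines (containers without a request) are skipped. sum_cpu_requests is a hypothetical helper name.

```shell
# Hypothetical helper: sum CPU request quantities read from stdin,
# one per line, and print the total in millicores.
sum_cpu_requests() {
  awk '
    /^[0-9.]+m$/ { total += substr($0, 1, length($0) - 1); next }  # millicores
    /^[0-9.]+$/  { total += $0 * 1000; next }                      # cores
    END          { printf "%dm\n", total }                         # anything else is ignored
  '
}

# Example (prints only the request values, one per line):
#   kubectl get po -A -o=jsonpath="{range .items[*]}{range .spec.containers[*]}{.resources.requests.cpu}{'\n'}{end}{end}" | sum_cpu_requests
```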

alias util='kubectl get nodes | grep node | awk '\''{print $1}'\'' | xargs -I {} sh -c '\''echo   {} ; kubectl describe node {} | grep Allocated -A 5 | grep -ve Event -ve Allocated -ve percent -ve -- ; echo '\'''

or

kubectl describe nodes

Watch pods on all nodes (-w keeps watching, & puts it in the background of the terminal)
```
kubectl get pods -A -w -o wide &
```

Get cpu metrics

# CPU and memory for a single pod
kubectl top pod pd-691234f8-rabcw

# Get the metrics for all nodes
kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq .

# Get the metrics for all pods
kubectl get --raw /apis/metrics.k8s.io/v1beta1/pods | jq .

# Get the metrics for one pod in a given namespace
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/<namespace-name>/pods/<pod-name>

K8s cluster autoscaler status

kubectl -n kube-system describe configmap cluster-autoscaler-status
kubectl -n kube-system get      configmap cluster-autoscaler-status -o yaml

Add a sniffer as a sidecar to a pod

Edit the deployment and add this container to the pod's containers list:

- name: tcpdump
  image: corfr/tcpdump
  command:
    - /bin/sleep
    - infinity
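Once the sidecar is running it only sleeps, so the capture has to be started by hand. A minimal sketch, assuming the container is named tcpdump as above and that the image has tcpdump on its PATH; pod_capture and the KUBECTL override variable are hypothetical names used here for illustration.

```shell
# Hypothetical helper: stream a packet capture from the tcpdump sidecar
# of a pod to stdout in pcap format.
#   $1 = namespace, $2 = pod name
# KUBECTL can be set to use a different kubectl binary (assumed convention).
pod_capture() {
  ns="$1"
  pod="$2"
  # -i any: all interfaces, -U: flush each packet, -w -: write pcap to stdout
  "${KUBECTL:-kubectl}" exec -n "$ns" "$pod" -c tcpdump -- tcpdump -i any -U -w -
}

# Example:
#   pod_capture default <pod-name> > capture.pcap
```

The resulting capture.pcap can be opened locally, for example in Wireshark.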

Capture console log lines on termination

terminationMessagePolicy: FallbackToLogsOnError

https://kubernetes.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/#writing-and-reading-a-termination-message

Moreover, users can set the terminationMessagePolicy field of a Container for further customization. This field defaults to “File” which means the termination messages are retrieved only from the termination message file. By setting the terminationMessagePolicy to “FallbackToLogsOnError”, you can tell Kubernetes to use the last chunk of container log output if the termination message file is empty and the container exited with an error. The log output is limited to 2048 bytes or 80 lines, whichever is smaller.
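To get a feel for how much of a log survives under that limit, the rule can be approximated locally. This is only a sketch of the "80 lines or 2048 bytes" cap described above, not the kubelet's actual implementation; termination_fallback is a hypothetical name.

```shell
# Hypothetical approximation of the fallback limit: keep at most the
# last 80 lines of stdin, then at most the last 2048 bytes of those.
termination_fallback() {
  tail -n 80 | tail -c 2048
}

# Example:
#   kubectl logs <pod-name> --previous | termination_fallback
```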

CategoryK8sKubernetes